prism-mcp-server 7.8.6 → 7.8.8
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +7 -4
- package/dist/dashboard/ui.js +72 -6
- package/dist/utils/llm/adapters/ollama.js +153 -0
- package/dist/utils/llm/factory.js +19 -9
- package/dist/utils/llm/provider.js +2 -0
- package/package.json +1 -1
package/README.md
CHANGED

@@ -691,9 +691,9 @@ The Generator strips the `console.log`, resubmits, and the next `EVALUATE` returns
 
 ## 🚀 What's New
 
-> **Current release: v7.8.
+> **Current release: v7.8.7 — Cognitive Architecture**
 
-- 🧠 **v7.8.
+- 🧠 **v7.8.x — Cognitive Architecture:** The biggest leap forward yet. Moved beyond flat vector search into a true cognitive architecture inspired by human brain mechanics. Episodic-to-Semantic memory consolidation (Hebbian learning), ACT-R Spreading Activation with multi-hop causal reasoning, Uncertainty-Aware Rejection Gate (your agent can say "I don't know"), and Dynamic Fast Weight Decay (semantic memories outlive episodic chatter by 2×). Validated by the **LoCoMo-Plus benchmark** (arXiv 2602.10715) with Precision@K and MRR metrics. **Your agents don't just remember; they learn.** → [Cognitive Architecture](#-cognitive-architecture-v78)
 - 🚀 **v7.7.0 — Cloud-Native SSE Transport:** Full unauthenticated and authenticated Server-Sent Events MCP support for seamless network deployments.
 - 🩺 **v7.5.0 — Intent Health Dashboard + Security Hardening:** Real-time 0–100 project health scoring (staleness × TODO load × decisions). 10 XSS injection vectors patched. Algorithm hardened with NaN guards and score ceiling.
 - ⚔️ **v7.4.0 — Adversarial Evaluation:** Split-brain anti-sycophancy pipeline. Generator and evaluator in isolated roles with evidence-bound findings.
@@ -968,6 +968,7 @@ Prism is a **stdio-based MCP server** that manages persistent agent memory. Here
 │ │ • ACT-R Spreading Activation (multi-hop)      │ │
 │ │ • Episodic → Semantic Consolidation (Hebbian) │ │
 │ │ • Uncertainty-Aware Rejection Gate            │ │
+│ │ • LoCoMo-Plus Benchmark Validation            │ │
 │ │ • Dynamic Fast Weight Decay (dual-rate)       │ │
 │ │ • HDC Cognitive Routing (XOR binding)         │ │
 │ └───────┬───────────────────────────────────────┘ │
@@ -1056,16 +1057,18 @@ Prism has evolved from smart session logging into a **cognitive memory architecture**
 | **v7.8** | Multi-Hop Causal Reasoning — spreading activation traverses `caused_by`/`led_to` edges with damped fan effect (`1/ln(fan+e)`) and lateral inhibition | ACT-R spreading activation (Anderson), Collins & Loftus (1975) | ✅ Shipped |
 | **v7.8** | Uncertainty-Aware Rejection Gate — dual-signal (similarity floor + gap distance) safety layer prevents hallucination from low-confidence retrievals | Metacognition research, uncertainty quantification | ✅ Shipped |
 | **v7.8** | Dynamic Fast Weight Decay — `is_rollup` semantic nodes decay 50% slower (`ageModifier = 0.5`) than episodic entries, creating Long-Term Context anchors | ACT-R base-level activation with differential decay rates | ✅ Shipped |
+| **v7.8** | LoCoMo Benchmark Harness — deterministic integration suite (`tests/benchmarks/locomo.ts`, 20 assertions) benchmarking multi-hop compaction structures via `MockLLM` | Long-Context Memory evaluation (cognitive benchmarking) | ✅ Shipped |
+| **v7.8** | LoCoMo-Plus Benchmark — 16-assertion suite (`tests/benchmarks/locomo-plus.ts`) adapted from arXiv 2602.10715, validating cue–trigger semantic disconnect bridging via graph traversal and Hebbian consolidation; reports Precision@1/3/5/10 and MRR | LoCoMo-Plus (Li et al., ARR 2026), cue–trigger disconnect research | ✅ Shipped |
 | **v7.x** | Affect-Tagged Memory — sentiment shapes what gets recalled | Affect-modulated retrieval (neuroscience) | 🔭 Horizon |
 | **v8+** | Zero-Search Retrieval — no index, no ANN, just ask the vector | Holographic Reduced Representations | 🔭 Horizon |
 
-> Informed by Anderson's ACT-R (Adaptive Control of Thought–Rational), Collins & Loftus spreading activation networks (1975), Kanerva's SDM (1988), Hebb's learning rule, and LeCun's "Why AI Systems Don't Learn" (Dupoux, LeCun, Malik).
+> Informed by Anderson's ACT-R (Adaptive Control of Thought–Rational), Collins & Loftus spreading activation networks (1975), Kanerva's SDM (1988), Hebb's learning rule, Li et al.'s LoCoMo-Plus (ARR 2026), and LeCun's "Why AI Systems Don't Learn" (Dupoux, LeCun, Malik).
 
 ---
 
 ## 📦 Milestones & Roadmap
 
-> **Current: v7.8.
+> **Current: v7.8.7** — Cognitive Architecture ([CHANGELOG](CHANGELOG.md))
 
 | Release | Headline |
 |---------|----------|
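The damped fan effect named in the roadmap table (`1/ln(fan+e)`) caps how much activation a heavily connected node can spread. A minimal sketch of that damping curve, written as our own illustration rather than code taken from the package:

```javascript
// Damped fan effect: weight each outgoing edge by 1 / ln(fan + e), so the
// activation a node spreads shrinks as its fan-out (edge count) grows.
function fanDampedWeight(fan) {
  return 1 / Math.log(fan + Math.E);
}

// A node with zero fan spreads at full strength, since ln(e) = 1.
console.log(fanDampedWeight(0)); // 1
// Higher fan-out means less activation per edge.
console.log(fanDampedWeight(10) < fanDampedWeight(2)); // true
```

Adding `e` inside the logarithm keeps the denominator at least 1, so the weight never exceeds full strength and never divides by zero.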
package/dist/dashboard/ui.js
CHANGED

@@ -1253,7 +1253,9 @@ Example:\n## Dev Rules\n- Always write tests first\n- Use TypeScript strict mode
               onchange="onEmbeddingProviderChange(this.value)">
         <option value="auto">🔄 Auto (same as Text Provider)</option>
         <option value="gemini">🔵 Gemini</option>
-        <option value="openai">🟢 OpenAI
+        <option value="openai">🟢 OpenAI</option>
+        <option value="voyage">🔮 Voyage AI</option>
+        <option value="ollama">🦙 Ollama (Local)</option>
       </select>
     </div>
@@ -1261,7 +1263,7 @@ Example:\n## Dev Rules\n- Always write tests first\n- Use TypeScript strict mode
     <div id="anthropic-embed-warning" style="display:none;margin-top:0.5rem;padding:0.5rem 0.75rem;background:rgba(251,146,60,0.1);border:1px solid rgba(251,146,60,0.3);border-radius:6px;font-size:0.78rem;color:#fb923c;line-height:1.5">
       ⚠️ <strong>Anthropic has no native embedding API.</strong>
       Auto mode will route embeddings to <strong>Gemini</strong>.
-      Set Embedding Provider to <strong>
+      Set Embedding Provider to <strong>Ollama (Local)</strong> for free local embeddings, or <strong>Voyage AI</strong> for the Anthropic-recommended cloud pairing.
     </div>
 
     <!-- OpenAI embedding model field (shown when embedding_provider = openai) -->
@@ -1269,7 +1271,7 @@ Example:\n## Dev Rules\n- Always write tests first\n- Use TypeScript strict mode
     <div class="setting-row">
       <div>
         <div class="setting-label">Embedding Model</div>
-        <div class="setting-desc">Must output 768 dims.
+        <div class="setting-desc">Must output 768 dims. Default: text-embedding-3-small</div>
       </div>
       <input type="text" id="input-openai-embedding-model"
              placeholder="text-embedding-3-small"
@@ -1279,9 +1281,61 @@ Example:\n## Dev Rules\n- Always write tests first\n- Use TypeScript strict mode
       </div>
     </div>
 
+    <!-- Voyage AI embedding fields (shown when embedding_provider = voyage) -->
+    <div id="embed-fields-voyage" style="display:none">
+      <div class="setting-row">
+        <div>
+          <div class="setting-label">Voyage API Key</div>
+          <div class="setting-desc">Get one free at <a href="https://dash.voyageai.com" target="_blank" style="color:var(--accent)">dash.voyageai.com</a></div>
+        </div>
+        <input type="password" id="input-voyage-api-key"
+               placeholder="pa-…"
+               style="padding: 0.2rem 0.5rem; background: var(--bg-hover); color: var(--text-primary); border: 1px solid var(--border-color); border-radius: 4px; font-size: 0.85rem; font-family: var(--font-mono); width: 180px;"
+               onchange="saveBootSetting('VOYAGE_API_KEY', this.value)"
+               oninput="clearTimeout(this._pv); var self=this; this._pv=setTimeout(function(){saveBootSetting('VOYAGE_API_KEY',self.value)},800)" />
+      </div>
+      <div class="setting-row">
+        <div>
+          <div class="setting-label">Voyage Model</div>
+          <div class="setting-desc">voyage-code-3 (code) · voyage-3 (general). Both MRL → 768 dims.</div>
+        </div>
+        <input type="text" id="input-voyage-model"
+               placeholder="voyage-code-3"
+               style="padding: 0.2rem 0.5rem; background: var(--bg-hover); color: var(--text-primary); border: 1px solid var(--border-color); border-radius: 4px; font-size: 0.85rem; font-family: var(--font-mono); width: 160px;"
+               onchange="saveBootSetting('voyage_model', this.value)"
+               oninput="clearTimeout(this._pvm); var self=this; this._pvm=setTimeout(function(){saveBootSetting('voyage_model',self.value)},800)" />
+      </div>
+    </div>
+
+    <!-- Ollama embedding fields (shown when embedding_provider = ollama) -->
+    <div id="embed-fields-ollama" style="display:none">
+      <div class="setting-row">
+        <div>
+          <div class="setting-label">Ollama Base URL</div>
+          <div class="setting-desc">Where Ollama is running locally</div>
+        </div>
+        <input type="text" id="input-ollama-base-url"
+               placeholder="http://localhost:11434"
+               style="padding: 0.2rem 0.5rem; background: var(--bg-hover); color: var(--text-primary); border: 1px solid var(--border-color); border-radius: 4px; font-size: 0.85rem; font-family: var(--font-mono); width: 220px;"
+               onchange="saveBootSetting('ollama_base_url', this.value)"
+               oninput="clearTimeout(this._pou); var self=this; this._pou=setTimeout(function(){saveBootSetting('ollama_base_url',self.value)},800)" />
+      </div>
+      <div class="setting-row">
+        <div>
+          <div class="setting-label">Embedding Model</div>
+          <div class="setting-desc">Must output 768 dims. <code>nomic-embed-text</code> recommended.</div>
+        </div>
+        <input type="text" id="input-ollama-model"
+               placeholder="nomic-embed-text"
+               style="padding: 0.2rem 0.5rem; background: var(--bg-hover); color: var(--text-primary); border: 1px solid var(--border-color); border-radius: 4px; font-size: 0.85rem; font-family: var(--font-mono); width: 180px;"
+               onchange="saveBootSetting('ollama_model', this.value)"
+               oninput="clearTimeout(this._pom); var self=this; this._pom=setTimeout(function(){saveBootSetting('ollama_model',self.value)},800)" />
+      </div>
+    </div>
+
     <div style="margin-top:1rem;padding:0.6rem 0.8rem;background:rgba(139,92,246,0.08);border:1px solid rgba(139,92,246,0.2);border-radius:6px;font-size:0.78rem;color:var(--text-secondary);line-height:1.5">
-      💡 <strong>
-      Use Claude
+      💡 <strong>Zero-cost setup:</strong> Text Provider → <code>Anthropic</code>, Embedding Provider → <code>Ollama (Local)</code>.<br>
+      Use Claude for reasoning & <code>nomic-embed-text</code> (free, local, 768-dim native) for embeddings.
     </div>
 
     <span class="setting-saved" id="savedToastProviders">Saved ✓</span>
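The "Both MRL → 768 dims" note above refers to Matryoshka Representation Learning: MRL-trained embeddings can be cut to a prefix of their dimensions and renormalized client-side. A sketch of that usual slice-and-renormalize step, as our own illustration (the package's actual Voyage adapter is not shown in this diff and may differ):

```javascript
// Truncate a Matryoshka (MRL) embedding to its first `dims` components,
// then L2-normalize the prefix so cosine similarity stays meaningful.
function truncateMrl(embedding, dims) {
  const head = embedding.slice(0, dims);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0)) || 1;
  return head.map(x => x / norm);
}

// Keep 2 dims of a 6-dim vector: [3, 4] has norm 5, so the result is [0.6, 0.8].
console.log(truncateMrl([3, 4, 0, 0, 7, 7], 2)); // [ 0.6, 0.8 ]
```

This only works well for models trained with MRL objectives; slicing an ordinary embedding this way degrades quality much more sharply.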
@@ -3135,8 +3189,10 @@ function onTextProviderChange(value) {
 // Called when the EMBEDDING provider dropdown changes.
 function onEmbeddingProviderChange(value) {
     var textVal = document.getElementById('select-text-provider').value;
-    // Show
+    // Show provider-specific fields based on the selected embedding provider
     document.getElementById('embed-fields-openai').style.display = value === 'openai' ? '' : 'none';
+    document.getElementById('embed-fields-voyage').style.display = value === 'voyage' ? '' : 'none';
+    document.getElementById('embed-fields-ollama').style.display = value === 'ollama' ? '' : 'none';
     refreshAnthropicWarning(textVal, value);
     saveBootSetting('embedding_provider', value);
 }
@@ -3174,7 +3230,17 @@ function loadAiProviderSettings() {
     if (embedSel)
         embedSel.value = embedProvider;
     document.getElementById('embed-fields-openai').style.display = embedProvider === 'openai' ? '' : 'none';
+    document.getElementById('embed-fields-voyage').style.display = embedProvider === 'voyage' ? '' : 'none';
+    document.getElementById('embed-fields-ollama').style.display = embedProvider === 'ollama' ? '' : 'none';
     refreshAnthropicWarning(textProvider, embedProvider);
+    var vKey = document.getElementById('input-voyage-api-key');
+    if (vKey) vKey.placeholder = s.VOYAGE_API_KEY ? '(key saved — paste to update)' : 'pa-…';
+    var vMod = document.getElementById('input-voyage-model');
+    if (vMod && s.voyage_model) vMod.value = s.voyage_model;
+    var olUrl = document.getElementById('input-ollama-base-url');
+    if (olUrl && s.ollama_base_url) olUrl.value = s.ollama_base_url;
+    var olMod = document.getElementById('input-ollama-model');
+    if (olMod && s.ollama_model) olMod.value = s.ollama_model;
     gKey = document.getElementById('input-google-api-key');
     if (gKey)
         gKey.placeholder = s.GOOGLE_API_KEY ? '(key saved — paste to update)' : 'AIza…';
package/dist/utils/llm/adapters/ollama.js
ADDED

@@ -0,0 +1,153 @@
+/**
+ * Ollama Adapter (v1.0 — nomic-embed-text)
+ * ─────────────────────────────────────────────────────────────────────────────
+ * PURPOSE:
+ * Implements LLMProvider using Ollama's native /api/embed REST endpoint for
+ * fully local, zero-cost text embeddings. No API key required — Ollama runs
+ * on localhost.
+ *
+ * TEXT GENERATION:
+ * This adapter is embeddings-only. generateText() throws an explicit error.
+ * Set text_provider separately (anthropic, openai, or gemini).
+ *
+ * EMBEDDING DIMENSION PARITY (768 dims):
+ * Prism's SQLite (sqlite-vec) and Supabase (pgvector) schemas define
+ * embedding columns as EXACTLY 768 dimensions.
+ *
+ * nomic-embed-text natively outputs 768 dims — zero truncation needed.
+ * It is the recommended default local model for Prism.
+ *
+ * SUPPORTED MODELS (all confirmed 768-dim via Ollama):
+ *   nomic-embed-text       — 768 dims, 274MB, best quality/size trade-off ✅ DEFAULT
+ *   nomic-embed-text:v1.5  — 768 dims, 274MB, same (stable alias)
+ *
+ * Models to AVOID with this adapter (wrong dim count):
+ *   mxbai-embed-large      — 1024 dims ❌ (use OpenAIAdapter instead)
+ *   all-minilm             — 384 dims ❌
+ *   snowflake-arctic-embed — varies ❌
+ *
+ * BATCH EMBEDDINGS:
+ * Uses /api/embed (plural) which is the official Ollama batch endpoint
+ * introduced in Ollama ≥ 0.3.0. Falls back gracefully for older versions.
+ *
+ * CONFIG KEYS (Prism dashboard "AI Providers" tab OR environment variables):
+ *   ollama_base_url — Base URL of Ollama server (default: http://localhost:11434)
+ *   ollama_model    — Embedding model (default: nomic-embed-text)
+ *
+ * USAGE:
+ * In the Prism dashboard, set:
+ *   embedding_provider = ollama
+ * Optionally set ollama_base_url and ollama_model to override defaults.
+ *
+ * API REFERENCE:
+ * https://github.com/ollama/ollama/blob/main/docs/api.md#generate-embeddings
+ */
+import { getSettingSync } from "../../../storage/configStorage.js";
+import { debugLog } from "../../logger.js";
+// ─── Constants ───────────────────────────────────────────────────────────────
+// Must match Prism's DB schema (sqlite-vec and pgvector column sizes).
+const EMBEDDING_DIMS = 768;
+// Generous character cap — nomic-embed-text has an 8192-token context window.
+const MAX_EMBEDDING_CHARS = 8000;
+const DEFAULT_BASE_URL = "http://localhost:11434";
+const DEFAULT_MODEL = "nomic-embed-text";
+// Connection retry settings — handles the common "forgot to start Ollama" race.
+const MAX_RETRIES = 2;
+const RETRY_DELAY_MS = 500;
+// ─── Adapter ─────────────────────────────────────────────────────────────────
+export class OllamaAdapter {
+    baseUrl;
+    model;
+    constructor() {
+        this.baseUrl = getSettingSync("ollama_base_url", DEFAULT_BASE_URL).replace(/\/$/, "");
+        this.model = getSettingSync("ollama_model", DEFAULT_MODEL);
+        debugLog(`[OllamaAdapter] Initialized — baseUrl=${this.baseUrl}, model=${this.model}`);
+    }
+    // ─── Text Generation (Not Supported) ────────────────────────────────────
+    async generateText(_prompt, _systemInstruction) {
+        throw new Error("OllamaAdapter does not support text generation. " +
+            "Set text_provider to 'anthropic', 'openai', or 'gemini' in the dashboard.");
+    }
+    // ─── Batch Embedding Generation ─────────────────────────────────────────
+    async generateEmbeddings(texts) {
+        if (!texts || texts.length === 0)
+            return [];
+        const model = this.model;
+        // Word-safe truncation — consistent with Voyage and OpenAI adapters.
+        const truncatedTexts = texts.map(text => {
+            if (text.length > MAX_EMBEDDING_CHARS) {
+                const cut = text.slice(0, MAX_EMBEDDING_CHARS);
+                const lastSpace = cut.lastIndexOf(" ");
+                return lastSpace > 0 ? cut.slice(0, lastSpace) : cut;
+            }
+            return text;
+        });
+        debugLog(`[OllamaAdapter] generateEmbeddings — model=${model}, count=${truncatedTexts.length}`);
+        // Retry loop — catches ECONNREFUSED when Ollama service hasn't started yet.
+        let response;
+        let lastError = null;
+        for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
+            try {
+                response = await fetch(`${this.baseUrl}/api/embed`, {
+                    method: "POST",
+                    headers: { "Content-Type": "application/json" },
+                    body: JSON.stringify({ model, input: truncatedTexts }),
+                });
+                if (!response.ok) {
+                    const errorText = await response.text().catch(() => "unknown error");
+                    throw new Error(`[OllamaAdapter] /api/embed request failed — status=${response.status}: ${errorText}. ` +
+                        `Make sure Ollama is running (ollama serve) and '${model}' has been pulled (ollama pull ${model}).`);
+                }
+                // Success — break out of retry loop.
+                lastError = null;
+                break;
+            }
+            catch (err) {
+                lastError = err instanceof Error ? err : new Error(String(err));
+                const isNetworkError = lastError.message.includes("ECONNREFUSED") ||
+                    lastError.message.includes("fetch failed") ||
+                    lastError.message.includes("ECONNRESET");
+                if (isNetworkError && attempt < MAX_RETRIES) {
+                    debugLog(`[OllamaAdapter] Connection failed (attempt ${attempt + 1}/${MAX_RETRIES + 1}): ` +
+                        `${lastError.message.substring(0, 80)}. Retrying in ${RETRY_DELAY_MS}ms...`);
+                    await new Promise(resolve => setTimeout(resolve, RETRY_DELAY_MS));
+                    continue;
+                }
+                throw lastError;
+            }
+        }
+        if (lastError)
+            throw lastError;
+        const data = (await response.json());
+        const embeddings = data?.embeddings;
+        if (!Array.isArray(embeddings) || embeddings.length === 0) {
+            throw new Error(`[OllamaAdapter] Empty embeddings response from model '${model}'.`);
+        }
+        if (embeddings.length !== texts.length) {
+            throw new Error(`[OllamaAdapter] Response length mismatch — expected ${texts.length}, got ${embeddings.length}.`);
+        }
+        // Validate dimensions and slice if model returned > 768 (shouldn't happen
+        // with nomic-embed-text but guards against model swaps).
+        return embeddings.map((emb, i) => {
+            if (emb.length > EMBEDDING_DIMS) {
+                debugLog(`[OllamaAdapter] Embedding[${i}] has ${emb.length} dims — truncating to ${EMBEDDING_DIMS}. ` +
+                    `Consider using a model that natively outputs ${EMBEDDING_DIMS} dims (e.g. nomic-embed-text).`);
+                return emb.slice(0, EMBEDDING_DIMS);
+            }
+            if (emb.length !== EMBEDDING_DIMS) {
+                throw new Error(`[OllamaAdapter] Dimension mismatch at index ${i}: expected ${EMBEDDING_DIMS}, ` +
+                    `got ${emb.length}. Model '${model}' is not compatible with Prism's 768-dim schema. ` +
+                    `Use nomic-embed-text which natively outputs 768 dims.`);
+            }
+            return emb;
+        });
+    }
+    // ─── Single Embedding (delegates to batch) ──────────────────────────────
+    async generateEmbedding(text) {
+        if (!text || !text.trim()) {
+            throw new Error("[OllamaAdapter] generateEmbedding called with empty text.");
+        }
+        const results = await this.generateEmbeddings([text]);
+        return results[0];
+    }
+}
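The word-safe truncation inside `generateEmbeddings` above can be factored into a standalone helper. This is our own extraction for illustration, mirroring the adapter's logic rather than an export of the package:

```javascript
// Cut text to at most maxChars, backing up to the last space so a word is
// never split mid-way (the same behavior as the adapter's MAX_EMBEDDING_CHARS cap).
function truncateWordSafe(text, maxChars) {
  if (text.length <= maxChars) return text;
  const cut = text.slice(0, maxChars);
  const lastSpace = cut.lastIndexOf(" ");
  // If there is no space at all, a hard cut is the only option.
  return lastSpace > 0 ? cut.slice(0, lastSpace) : cut;
}

console.log(truncateWordSafe("hello world again", 13)); // "hello world"
console.log(truncateWordSafe("unbroken", 4));           // "unbr"
```

Backing up to a space matters for embeddings: a half-word at the end of the input is a token the model never saw in training, so trimming it is cheap insurance.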
package/dist/utils/llm/factory.js
CHANGED

@@ -1,5 +1,5 @@
 /**
- * LLM Provider Factory (v4.
+ * LLM Provider Factory (v4.6 — Ollama Local Embedding Support)
  * ─────────────────────────────────────────────────────────────────────────────
  * PURPOSE:
  * Single point of resolution for the active LLMProvider.
@@ -11,7 +11,7 @@
  * Two independent settings control text and embedding routing:
  *
  *   text_provider      — "gemini" (default) | "openai" | "anthropic"
- *   embedding_provider — "auto" (default) | "gemini" | "openai" | "voyage"
+ *   embedding_provider — "auto" (default) | "gemini" | "openai" | "voyage" | "ollama"
 *
 * When embedding_provider = "auto":
 *   * If text_provider is gemini or openai → use same provider for embeddings
@@ -24,8 +24,10 @@
 *   text_provider=openai, embedding_provider=auto → OpenAI+OpenAI
 *   text_provider=anthropic, embedding_provider=auto → Claude+Gemini (auto-bridge)
 *   text_provider=anthropic, embedding_provider=voyage → Claude+Voyage (Anthropic-recommended)
- *   text_provider=anthropic, embedding_provider=openai → Claude+
+ *   text_provider=anthropic, embedding_provider=openai → Claude+OpenAI cloud embeddings
+ *   text_provider=anthropic, embedding_provider=ollama → Claude+Ollama (fully local, zero-cost)
 *   text_provider=gemini, embedding_provider=voyage → Gemini+Voyage (mixed)
+ *   text_provider=gemini, embedding_provider=ollama → Gemini+Ollama (hybrid cloud/local)
 *
 * SINGLETON + GRACEFUL DEGRADATION:
 * Same as before — instance cached per process, errors fall back to Gemini.
@@ -44,6 +46,7 @@ import { GeminiAdapter } from "./adapters/gemini.js";
 import { OpenAIAdapter } from "./adapters/openai.js";
 import { AnthropicAdapter } from "./adapters/anthropic.js";
 import { VoyageAdapter } from "./adapters/voyage.js";
+import { OllamaAdapter } from "./adapters/ollama.js";
 import { TracingLLMProvider } from "./adapters/traced.js";
 // Module-level singleton — one composed provider per MCP server process.
 let providerInstance = null;
@@ -62,10 +65,12 @@ function buildEmbeddingAdapter(type) {
     // Note: "anthropic" is intentionally absent from this switch.
     // Anthropic has no embedding API, so it can never be an embedding provider.
     // The factory resolves "auto" away from "anthropic" before calling this.
-    // For Anthropic text users, "voyage" is the
+    // For Anthropic text users, "voyage" is the recommended pairing;
+    // "ollama" is the fully local zero-cost alternative.
     switch (type) {
         case "openai": return new OpenAIAdapter();
         case "voyage": return new VoyageAdapter();
+        case "ollama": return new OllamaAdapter();
         case "gemini":
         default: return new GeminiAdapter();
     }
@@ -90,10 +95,15 @@ export function getLLMProvider() {
     let embedType = getSettingSync("embedding_provider", "auto");
     if (embedType === "auto") {
         if (process.env.VOYAGE_API_KEY) {
-            //
-            //
+            // Voyage takes first priority when available — voyage-code-3 strongly
+            // outperforms general embeddings on code contexts.
             embedType = "voyage";
         }
+        else if (process.env.OLLAMA_HOST || process.env.OLLAMA_BASE_URL) {
+            // Ollama is second priority: fully local, zero-cost, zero-latency.
+            // Activated when OLLAMA_HOST or OLLAMA_BASE_URL env var is set.
+            embedType = "ollama";
+        }
         else {
             // Anthropic has no embedding API — auto-bridge to Gemini.
             // For all other text providers, use the same provider for embeddings.
@@ -101,9 +111,9 @@ export function getLLMProvider() {
         if (textType === "anthropic") {
             console.error("[LLMFactory] text_provider=anthropic with embedding_provider=auto: " +
                 "routing embeddings to GeminiAdapter (Anthropic has no native embedding API). " +
-                "For the Anthropic-recommended pairing, set embedding_provider=voyage in the dashboard " +
-                "
-                "
+                "For the Anthropic-recommended pairing, set embedding_provider=voyage in the dashboard. " +
+                "For a fully local, zero-cost option, set embedding_provider=ollama " +
+                "(requires 'ollama pull nomic-embed-text').");
         }
     }
 }
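The factory's `auto` branch above resolves in a fixed order: Voyage if a key is present, then Ollama if a host is configured, otherwise same-as-text with Anthropic bridged to Gemini. That precedence can be sketched as a pure function; the name and signature here are our own illustration, not the package's exports:

```javascript
// Resolve the effective embedding provider the way the factory's "auto"
// branch does: Voyage > Ollama > same-as-text, with Anthropic → Gemini.
function resolveEmbedProvider(setting, env, textProvider) {
  if (setting !== "auto") return setting; // explicit choice always wins
  if (env.VOYAGE_API_KEY) return "voyage";
  if (env.OLLAMA_HOST || env.OLLAMA_BASE_URL) return "ollama";
  // Anthropic has no embedding API, so auto bridges it to Gemini.
  return textProvider === "anthropic" ? "gemini" : textProvider;
}

console.log(resolveEmbedProvider("auto", {}, "anthropic"));                        // "gemini"
console.log(resolveEmbedProvider("auto", { OLLAMA_HOST: "localhost" }, "gemini")); // "ollama"
console.log(resolveEmbedProvider("voyage", {}, "openai"));                         // "voyage"
```

Keeping the resolution pure like this makes the precedence testable without touching the singleton or real environment variables.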
package/dist/utils/llm/provider.js
CHANGED

@@ -16,6 +16,8 @@
 * - gemini.ts    — Google Gemini (default; all methods including VLM)
 * - openai.ts    — OpenAI Cloud + Ollama + LM Studio + vLLM
 * - anthropic.ts — Anthropic Claude (VLM supported; embeddings unsupported)
+ * - voyage.ts    — Voyage AI (embeddings only; Anthropic-recommended pairing)
+ * - ollama.ts    — Ollama native /api/embed (embeddings only; fully local, zero-cost)
 *
 * FACTORY RESOLUTION:
 * Never instantiate adapters directly. Always call:
package/package.json
CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "prism-mcp-server",
-  "version": "7.8.
+  "version": "7.8.8",
   "mcpName": "io.github.dcostenco/prism-mcp",
   "description": "The Mind Palace for AI Agents — a true Cognitive Architecture with Hebbian learning (episodic→semantic consolidation), ACT-R spreading activation (multi-hop causal reasoning), uncertainty-aware rejection gates (agents that know when they don't know), adversarial evaluation (anti-sycophancy), fail-closed Dark Factory pipelines, persistent memory (SQLite/Supabase), multi-agent Hivemind, time travel & visual dashboard. Zero-config local mode.",
   "module": "index.ts",