prism-mcp-server 7.8.6 → 7.8.8

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -691,9 +691,9 @@ The Generator strips the `console.log`, resubmits, and the next `EVALUATE` retur
 
  ## 🆕 What's New
 
- > **Current release: v7.8.2 — Cognitive Architecture**
+ > **Current release: v7.8.7 — Cognitive Architecture**
 
- - 🧠 **v7.8.0 — Cognitive Architecture:** The biggest leap forward yet. Moved beyond flat vector search into a true cognitive architecture inspired by human brain mechanics. Episodic-to-Semantic memory consolidation (Hebbian learning), ACT-R Spreading Activation with multi-hop causal reasoning, Uncertainty-Aware Rejection Gate (your agent can say "I don't know"), and Dynamic Fast Weight Decay (semantic memories outlive episodic chatter by 2×). **Your agents don't just remember; they learn.** → [Cognitive Architecture](#-cognitive-architecture-v78)
+ - 🧠 **v7.8.x — Cognitive Architecture:** The biggest leap forward yet. Moved beyond flat vector search into a true cognitive architecture inspired by human brain mechanics. Episodic-to-Semantic memory consolidation (Hebbian learning), ACT-R Spreading Activation with multi-hop causal reasoning, Uncertainty-Aware Rejection Gate (your agent can say "I don't know"), and Dynamic Fast Weight Decay (semantic memories outlive episodic chatter by 2×). Validated by **LoCoMo-Plus benchmark** (arXiv 2602.10715) with Precision@K and MRR metrics. **Your agents don't just remember; they learn.** → [Cognitive Architecture](#-cognitive-architecture-v78)
  - 🌐 **v7.7.0 — Cloud-Native SSE Transport:** Full unauthenticated and authenticated Server-Sent Events MCP support for seamless network deployments.
  - 🩺 **v7.5.0 — Intent Health Dashboard + Security Hardening:** Real-time 0–100 project health scoring (staleness × TODO load × decisions). 10 XSS injection vectors patched. Algorithm hardened with NaN guards and score ceiling.
  - ⚔️ **v7.4.0 — Adversarial Evaluation:** Split-brain anti-sycophancy pipeline. Generator and evaluator in isolated roles with evidence-bound findings.
@@ -968,6 +968,7 @@ Prism is a **stdio-based MCP server** that manages persistent agent memory. Here
  │ │ • ACT-R Spreading Activation (multi-hop) │ │
  │ │ • Episodic → Semantic Consolidation (Hebbian) │ │
  │ │ • Uncertainty-Aware Rejection Gate │ │
+ │ │ • LoCoMo-Plus Benchmark Validation │ │
  │ │ • Dynamic Fast Weight Decay (dual-rate) │ │
  │ │ • HDC Cognitive Routing (XOR binding) │ │
  │ └──────┬──────────────────────────────────────────┘ │
@@ -1056,16 +1057,18 @@ Prism has evolved from smart session logging into a **cognitive memory architect
  | **v7.8** | Multi-Hop Causal Reasoning — spreading activation traverses `caused_by`/`led_to` edges with damped fan effect (`1/ln(fan+e)`) and lateral inhibition | ACT-R spreading activation (Anderson), Collins & Loftus (1975) | ✅ Shipped |
  | **v7.8** | Uncertainty-Aware Rejection Gate — dual-signal (similarity floor + gap distance) safety layer prevents hallucination from low-confidence retrievals | Metacognition research, uncertainty quantification | ✅ Shipped |
  | **v7.8** | Dynamic Fast Weight Decay — `is_rollup` semantic nodes decay 50% slower (`ageModifier = 0.5`) than episodic entries, creating Long-Term Context anchors | ACT-R base-level activation with differential decay rates | ✅ Shipped |
+ | **v7.8** | LoCoMo Benchmark Harness — deterministic integration suite (`tests/benchmarks/locomo.ts`, 20 assertions) benchmarking multi-hop compaction structures via `MockLLM` | Long-Context Memory evaluation (cognitive benchmarking) | ✅ Shipped |
+ | **v7.8** | LoCoMo-Plus Benchmark — 16-assertion suite (`tests/benchmarks/locomo-plus.ts`) adapted from arXiv 2602.10715 validating cue–trigger semantic disconnect bridging via graph traversal and Hebbian consolidation; reports Precision@1/3/5/10 and MRR | LoCoMo-Plus (Li et al., ARR 2026), cue–trigger disconnect research | ✅ Shipped |
  | **v7.x** | Affect-Tagged Memory — sentiment shapes what gets recalled | Affect-modulated retrieval (neuroscience) | 🔭 Horizon |
  | **v8+** | Zero-Search Retrieval — no index, no ANN, just ask the vector | Holographic Reduced Representations | 🔭 Horizon |
 
- > Informed by Anderson's ACT-R (Adaptive Control of Thought—Rational), Collins & Loftus spreading activation networks (1975), Kanerva's SDM (1988), Hebb's learning rule, and LeCun's "Why AI Systems Don't Learn" (Dupoux, LeCun, Malik).
+ > Informed by Anderson's ACT-R (Adaptive Control of Thought—Rational), Collins & Loftus spreading activation networks (1975), Kanerva's SDM (1988), Hebb's learning rule, Li et al. LoCoMo-Plus (ARR 2026), and LeCun's "Why AI Systems Don't Learn" (Dupoux, LeCun, Malik).
 
  ---
 
  ## 📦 Milestones & Roadmap
 
- > **Current: v7.8.2** — Cognitive Architecture ([CHANGELOG](CHANGELOG.md))
+ > **Current: v7.8.7** — Cognitive Architecture ([CHANGELOG](CHANGELOG.md))
 
  | Release | Headline |
  |---------|----------|
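The damped fan effect `1/ln(fan+e)` cited in the table above is simple to compute directly. A one-function sketch (the name `fanDamping` is ours, not Prism's API):

```javascript
// Spread leaving a node is divided by ln(fan + e), where fan is the node's
// outgoing edge count: an isolated node passes full activation, while
// highly connected hub nodes contribute progressively less per edge.
function fanDamping(fan) {
  return 1 / Math.log(fan + Math.E);
}

console.log(fanDamping(0));  // ln(e) = 1, so an isolated node is undamped
console.log(fanDamping(2));  // moderate fan, moderate damping
console.log(fanDamping(10) < fanDamping(2)); // hubs are damped harder
```

Adding `e` inside the logarithm keeps the denominator at or above 1 for all non-negative fan counts, so damping can only attenuate, never amplify.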
@@ -1253,7 +1253,9 @@ Example:\n## Dev Rules\n- Always write tests first\n- Use TypeScript strict mode
  onchange="onEmbeddingProviderChange(this.value)">
  <option value="auto">🔄 Auto (same as Text Provider)</option>
  <option value="gemini">🔵 Gemini</option>
- <option value="openai">🟢 OpenAI / Ollama</option>
+ <option value="openai">🟢 OpenAI</option>
+ <option value="voyage">🔮 Voyage AI</option>
+ <option value="ollama">🟠 Ollama (Local)</option>
  </select>
  </div>
 
@@ -1261,7 +1263,7 @@ Example:\n## Dev Rules\n- Always write tests first\n- Use TypeScript strict mode
  <div id="anthropic-embed-warning" style="display:none;margin-top:0.5rem;padding:0.5rem 0.75rem;background:rgba(251,146,60,0.1);border:1px solid rgba(251,146,60,0.3);border-radius:6px;font-size:0.78rem;color:#fb923c;line-height:1.5">
  ⚠️ <strong>Anthropic has no native embedding API.</strong>
  Auto mode will route embeddings to <strong>Gemini</strong>.
- Set Embedding Provider to <strong>OpenAI / Ollama</strong> to use a local model (e.g. <code>nomic-embed-text</code>).
+ Set Embedding Provider to <strong>Ollama (Local)</strong> for free local embeddings, or <strong>Voyage AI</strong> for the Anthropic-recommended cloud pairing.
  </div>
 
  <!-- OpenAI embedding model field (shown when embedding_provider = openai) -->
@@ -1269,7 +1271,7 @@ Example:\n## Dev Rules\n- Always write tests first\n- Use TypeScript strict mode
  <div class="setting-row">
  <div>
  <div class="setting-label">Embedding Model</div>
- <div class="setting-desc">Must output 768 dims. Ollama: nomic-embed-text · OpenAI: text-embedding-3-small</div>
+ <div class="setting-desc">Must output 768 dims. Default: text-embedding-3-small</div>
  </div>
  <input type="text" id="input-openai-embedding-model"
  placeholder="text-embedding-3-small"
@@ -1279,9 +1281,61 @@ Example:\n## Dev Rules\n- Always write tests first\n- Use TypeScript strict mode
  </div>
  </div>
 
+ <!-- Voyage AI embedding fields (shown when embedding_provider = voyage) -->
+ <div id="embed-fields-voyage" style="display:none">
+ <div class="setting-row">
+ <div>
+ <div class="setting-label">Voyage API Key</div>
+ <div class="setting-desc">Get one free at <a href="https://dash.voyageai.com" target="_blank" style="color:var(--accent)">dash.voyageai.com</a></div>
+ </div>
+ <input type="password" id="input-voyage-api-key"
+ placeholder="pa-…"
+ style="padding: 0.2rem 0.5rem; background: var(--bg-hover); color: var(--text-primary); border: 1px solid var(--border-color); border-radius: 4px; font-size: 0.85rem; font-family: var(--font-mono); width: 180px;"
+ onchange="saveBootSetting('VOYAGE_API_KEY', this.value)"
+ oninput="clearTimeout(this._pv); var self=this; this._pv=setTimeout(function(){saveBootSetting('VOYAGE_API_KEY',self.value)},800)" />
+ </div>
+ <div class="setting-row">
+ <div>
+ <div class="setting-label">Voyage Model</div>
+ <div class="setting-desc">voyage-code-3 (code) · voyage-3 (general). Both MRL → 768 dims.</div>
+ </div>
+ <input type="text" id="input-voyage-model"
+ placeholder="voyage-code-3"
+ style="padding: 0.2rem 0.5rem; background: var(--bg-hover); color: var(--text-primary); border: 1px solid var(--border-color); border-radius: 4px; font-size: 0.85rem; font-family: var(--font-mono); width: 160px;"
+ onchange="saveBootSetting('voyage_model', this.value)"
+ oninput="clearTimeout(this._pvm); var self=this; this._pvm=setTimeout(function(){saveBootSetting('voyage_model',self.value)},800)" />
+ </div>
+ </div>
+
+ <!-- Ollama embedding fields (shown when embedding_provider = ollama) -->
+ <div id="embed-fields-ollama" style="display:none">
+ <div class="setting-row">
+ <div>
+ <div class="setting-label">Ollama Base URL</div>
+ <div class="setting-desc">Where Ollama is running locally</div>
+ </div>
+ <input type="text" id="input-ollama-base-url"
+ placeholder="http://localhost:11434"
+ style="padding: 0.2rem 0.5rem; background: var(--bg-hover); color: var(--text-primary); border: 1px solid var(--border-color); border-radius: 4px; font-size: 0.85rem; font-family: var(--font-mono); width: 220px;"
+ onchange="saveBootSetting('ollama_base_url', this.value)"
+ oninput="clearTimeout(this._pou); var self=this; this._pou=setTimeout(function(){saveBootSetting('ollama_base_url',self.value)},800)" />
+ </div>
+ <div class="setting-row">
+ <div>
+ <div class="setting-label">Embedding Model</div>
+ <div class="setting-desc">Must output 768 dims. <code>nomic-embed-text</code> recommended.</div>
+ </div>
+ <input type="text" id="input-ollama-model"
+ placeholder="nomic-embed-text"
+ style="padding: 0.2rem 0.5rem; background: var(--bg-hover); color: var(--text-primary); border: 1px solid var(--border-color); border-radius: 4px; font-size: 0.85rem; font-family: var(--font-mono); width: 180px;"
+ onchange="saveBootSetting('ollama_model', this.value)"
+ oninput="clearTimeout(this._pom); var self=this; this._pom=setTimeout(function(){saveBootSetting('ollama_model',self.value)},800)" />
+ </div>
+ </div>
+
  <div style="margin-top:1rem;padding:0.6rem 0.8rem;background:rgba(139,92,246,0.08);border:1px solid rgba(139,92,246,0.2);border-radius:6px;font-size:0.78rem;color:var(--text-secondary);line-height:1.5">
- 💡 <strong>Cost-optimized setup:</strong> Text Provider → <code>Anthropic</code>, Embedding Provider → <code>OpenAI / Ollama</code>.<br>
- Use Claude 3.5 Sonnet for reasoning &amp; <code>nomic-embed-text</code> (free, local) for embeddings.
+ 💡 <strong>Zero-cost setup:</strong> Text Provider → <code>Anthropic</code>, Embedding Provider → <code>Ollama (Local)</code>.<br>
+ Use Claude for reasoning &amp; <code>nomic-embed-text</code> (free, local, 768-dim native) for embeddings.
  </div>
 
  <span class="setting-saved" id="savedToastProviders">Saved ✓</span>
@@ -3135,8 +3189,10 @@ function onTextProviderChange(value) {
  // Called when the EMBEDDING provider dropdown changes.
  function onEmbeddingProviderChange(value) {
      var textVal = document.getElementById('select-text-provider').value;
-     // Show the OpenAI embedding model field only when embedding=openai
+     // Show provider-specific fields based on the selected embedding provider
      document.getElementById('embed-fields-openai').style.display = value === 'openai' ? '' : 'none';
+     document.getElementById('embed-fields-voyage').style.display = value === 'voyage' ? '' : 'none';
+     document.getElementById('embed-fields-ollama').style.display = value === 'ollama' ? '' : 'none';
      refreshAnthropicWarning(textVal, value);
      saveBootSetting('embedding_provider', value);
  }
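The repeated `style.display` toggles added in this hunk follow one rule: exactly one provider panel is visible at a time. The same mapping as a table-driven helper (a sketch assuming the three panel ids from the markup; not part of the dashboard code):

```javascript
// Which provider-specific settings panel is visible for a given
// embedding_provider value. '' means "use default display", 'none' hides,
// matching the inline ternaries in onEmbeddingProviderChange.
const EMBED_PANELS = ["openai", "voyage", "ollama"];

function embedPanelVisibility(provider) {
  const out = {};
  for (const p of EMBED_PANELS) {
    out["embed-fields-" + p] = provider === p ? "" : "none";
  }
  return out;
}

console.log(embedPanelVisibility("ollama"));
// only 'embed-fields-ollama' maps to '' (visible); the other two map to 'none'
```

Driving the DOM from one mapping like this keeps the dropdown handler and the settings loader from drifting apart when a fourth provider is added.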
@@ -3174,7 +3230,17 @@ function loadAiProviderSettings() {
      if (embedSel)
          embedSel.value = embedProvider;
      document.getElementById('embed-fields-openai').style.display = embedProvider === 'openai' ? '' : 'none';
+     document.getElementById('embed-fields-voyage').style.display = embedProvider === 'voyage' ? '' : 'none';
+     document.getElementById('embed-fields-ollama').style.display = embedProvider === 'ollama' ? '' : 'none';
      refreshAnthropicWarning(textProvider, embedProvider);
+     var vKey = document.getElementById('input-voyage-api-key');
+     if (vKey) vKey.placeholder = s.VOYAGE_API_KEY ? '(key saved — paste to update)' : 'pa-…';
+     var vMod = document.getElementById('input-voyage-model');
+     if (vMod && s.voyage_model) vMod.value = s.voyage_model;
+     var olUrl = document.getElementById('input-ollama-base-url');
+     if (olUrl && s.ollama_base_url) olUrl.value = s.ollama_base_url;
+     var olMod = document.getElementById('input-ollama-model');
+     if (olMod && s.ollama_model) olMod.value = s.ollama_model;
      gKey = document.getElementById('input-google-api-key');
      if (gKey)
          gKey.placeholder = s.GOOGLE_API_KEY ? '(key saved — paste to update)' : 'AIza…';
@@ -0,0 +1,153 @@
+ /**
+  * Ollama Adapter (v1.0 — nomic-embed-text)
+  * ───────────────────────────────────────────────────────────────────────────
+  * PURPOSE:
+  * Implements LLMProvider using Ollama's native /api/embed REST endpoint for
+  * fully local, zero-cost text embeddings. No API key required — Ollama runs
+  * on localhost.
+  *
+  * TEXT GENERATION:
+  * This adapter is embeddings-only. generateText() throws an explicit error.
+  * Set text_provider separately (anthropic, openai, or gemini).
+  *
+  * EMBEDDING DIMENSION PARITY (768 dims):
+  * Prism's SQLite (sqlite-vec) and Supabase (pgvector) schemas define
+  * embedding columns as EXACTLY 768 dimensions.
+  *
+  * nomic-embed-text natively outputs 768 dims — zero truncation needed.
+  * It is the recommended default local model for Prism.
+  *
+  * SUPPORTED MODELS (all confirmed 768-dim via Ollama):
+  *   nomic-embed-text      — 768 dims, 274MB, best quality/size trade-off ✅ DEFAULT
+  *   nomic-embed-text:v1.5 — 768 dims, 274MB, same (stable alias)
+  *
+  * Models to AVOID with this adapter (wrong dim count):
+  *   mxbai-embed-large      — 1024 dims ❌ (use OpenAIAdapter instead)
+  *   all-minilm             — 384 dims ❌
+  *   snowflake-arctic-embed — varies ❌
+  *
+  * BATCH EMBEDDINGS:
+  * Uses /api/embed (plural) which is the official Ollama batch endpoint
+  * introduced in Ollama ≥ 0.3.0. Falls back gracefully for older versions.
+  *
+  * CONFIG KEYS (Prism dashboard "AI Providers" tab OR environment variables):
+  *   ollama_base_url — Base URL of Ollama server (default: http://localhost:11434)
+  *   ollama_model    — Embedding model (default: nomic-embed-text)
+  *
+  * USAGE:
+  * In the Prism dashboard, set:
+  *   embedding_provider = ollama
+  * Optionally set ollama_base_url and ollama_model to override defaults.
+  *
+  * API REFERENCE:
+  * https://github.com/ollama/ollama/blob/main/docs/api.md#generate-embeddings
+  */
+ import { getSettingSync } from "../../../storage/configStorage.js";
+ import { debugLog } from "../../logger.js";
+ // ─── Constants ──────────────────────────────────────────────────────────────
+ // Must match Prism's DB schema (sqlite-vec and pgvector column sizes).
+ const EMBEDDING_DIMS = 768;
+ // Generous character cap — nomic-embed-text has an 8192-token context window.
+ const MAX_EMBEDDING_CHARS = 8000;
+ const DEFAULT_BASE_URL = "http://localhost:11434";
+ const DEFAULT_MODEL = "nomic-embed-text";
+ // Connection retry settings — handles the common "forgot to start Ollama" race.
+ const MAX_RETRIES = 2;
+ const RETRY_DELAY_MS = 500;
+ // ─── Adapter ────────────────────────────────────────────────────────────────
+ export class OllamaAdapter {
+     baseUrl;
+     model;
+     constructor() {
+         this.baseUrl = getSettingSync("ollama_base_url", DEFAULT_BASE_URL).replace(/\/$/, "");
+         this.model = getSettingSync("ollama_model", DEFAULT_MODEL);
+         debugLog(`[OllamaAdapter] Initialized — baseUrl=${this.baseUrl}, model=${this.model}`);
+     }
+     // ─── Text Generation (Not Supported) ────────────────────────────────────
+     async generateText(_prompt, _systemInstruction) {
+         throw new Error("OllamaAdapter does not support text generation. " +
+             "Set text_provider to 'anthropic', 'openai', or 'gemini' in the dashboard.");
+     }
+     // ─── Batch Embedding Generation ─────────────────────────────────────────
+     async generateEmbeddings(texts) {
+         if (!texts || texts.length === 0)
+             return [];
+         const model = this.model;
+         // Word-safe truncation — consistent with Voyage and OpenAI adapters.
+         const truncatedTexts = texts.map(text => {
+             if (text.length > MAX_EMBEDDING_CHARS) {
+                 const cut = text.slice(0, MAX_EMBEDDING_CHARS);
+                 const lastSpace = cut.lastIndexOf(" ");
+                 return lastSpace > 0 ? cut.slice(0, lastSpace) : cut;
+             }
+             return text;
+         });
+         debugLog(`[OllamaAdapter] generateEmbeddings — model=${model}, count=${truncatedTexts.length}`);
+         // Retry loop — catches ECONNREFUSED when Ollama service hasn't started yet.
+         let response;
+         let lastError = null;
+         for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
+             try {
+                 response = await fetch(`${this.baseUrl}/api/embed`, {
+                     method: "POST",
+                     headers: { "Content-Type": "application/json" },
+                     body: JSON.stringify({ model, input: truncatedTexts }),
+                 });
+                 if (!response.ok) {
+                     const errorText = await response.text().catch(() => "unknown error");
+                     throw new Error(`[OllamaAdapter] /api/embed request failed — status=${response.status}: ${errorText}. ` +
+                         `Make sure Ollama is running (ollama serve) and '${model}' has been pulled (ollama pull ${model}).`);
+                 }
+                 // Success — break out of retry loop.
+                 lastError = null;
+                 break;
+             }
+             catch (err) {
+                 lastError = err instanceof Error ? err : new Error(String(err));
+                 const isNetworkError = lastError.message.includes("ECONNREFUSED") ||
+                     lastError.message.includes("fetch failed") ||
+                     lastError.message.includes("ECONNRESET");
+                 if (isNetworkError && attempt < MAX_RETRIES) {
+                     debugLog(`[OllamaAdapter] Connection failed (attempt ${attempt + 1}/${MAX_RETRIES + 1}): ` +
+                         `${lastError.message.substring(0, 80)}. Retrying in ${RETRY_DELAY_MS}ms...`);
+                     await new Promise(resolve => setTimeout(resolve, RETRY_DELAY_MS));
+                     continue;
+                 }
+                 throw lastError;
+             }
+         }
+         if (lastError)
+             throw lastError;
+         const data = (await response.json());
+         const embeddings = data?.embeddings;
+         if (!Array.isArray(embeddings) || embeddings.length === 0) {
+             throw new Error(`[OllamaAdapter] Empty embeddings response from model '${model}'.`);
+         }
+         if (embeddings.length !== texts.length) {
+             throw new Error(`[OllamaAdapter] Response length mismatch — expected ${texts.length}, got ${embeddings.length}.`);
+         }
+         // Validate dimensions and slice if model returned > 768 (shouldn't happen
+         // with nomic-embed-text but guards against model swaps).
+         return embeddings.map((emb, i) => {
+             if (emb.length > EMBEDDING_DIMS) {
+                 debugLog(`[OllamaAdapter] Embedding[${i}] has ${emb.length} dims — truncating to ${EMBEDDING_DIMS}. ` +
+                     `Consider using a model that natively outputs ${EMBEDDING_DIMS} dims (e.g. nomic-embed-text).`);
+                 return emb.slice(0, EMBEDDING_DIMS);
+             }
+             if (emb.length !== EMBEDDING_DIMS) {
+                 throw new Error(`[OllamaAdapter] Dimension mismatch at index ${i}: expected ${EMBEDDING_DIMS}, ` +
+                     `got ${emb.length}. Model '${model}' is not compatible with Prism's 768-dim schema. ` +
+                     `Use nomic-embed-text which natively outputs 768 dims.`);
+             }
+             return emb;
+         });
+     }
+     // ─── Single Embedding (delegates to batch) ──────────────────────────────
+     async generateEmbedding(text) {
+         if (!text || !text.trim()) {
+             throw new Error("[OllamaAdapter] generateEmbedding called with empty text.");
+         }
+         const results = await this.generateEmbeddings([text]);
+         return results[0];
+     }
+ }
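The word-safe truncation and 768-dim guard in the adapter above are pure logic and can be checked in isolation. A standalone re-implementation (a sketch mirroring the adapter's behavior, not an export of the package):

```javascript
const EMBEDDING_DIMS = 768;
const MAX_EMBEDDING_CHARS = 8000;

// Cut at the character cap, then back off to the last space so the embedded
// text never ends on a split word (same policy as the adapter's map callback).
function truncateWordSafe(text, max = MAX_EMBEDDING_CHARS) {
  if (text.length <= max) return text;
  const cut = text.slice(0, max);
  const lastSpace = cut.lastIndexOf(" ");
  return lastSpace > 0 ? cut.slice(0, lastSpace) : cut;
}

// Mirror the adapter's dimension policy: slice oversized vectors down to 768,
// reject anything shorter than 768 outright.
function fitToDims(emb, dims = EMBEDDING_DIMS) {
  if (emb.length > dims) return emb.slice(0, dims);
  if (emb.length !== dims) {
    throw new Error(`Dimension mismatch: expected ${dims}, got ${emb.length}`);
  }
  return emb;
}

console.log(truncateWordSafe("alpha beta gamma", 10)); // "alpha" — backs off to the last full word
console.log(fitToDims(new Array(1024).fill(0)).length); // 768
```

Note the back-off is deliberately conservative: even when the cap lands exactly on a word boundary, the final word before the cap is dropped rather than risking a partial token.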
@@ -1,5 +1,5 @@
  /**
-  * LLM Provider Factory (v4.5 — Voyage AI Embedding Support)
+  * LLM Provider Factory (v4.6 — Ollama Local Embedding Support)
   * ───────────────────────────────────────────────────────────────────────────
   * PURPOSE:
   * Single point of resolution for the active LLMProvider.
@@ -11,7 +11,7 @@
   * Two independent settings control text and embedding routing:
   *
   *   text_provider      — "gemini" (default) | "openai" | "anthropic"
-  *   embedding_provider — "auto" (default) | "gemini" | "openai" | "voyage"
+  *   embedding_provider — "auto" (default) | "gemini" | "openai" | "voyage" | "ollama"
   *
   * When embedding_provider = "auto":
   *   * If text_provider is gemini or openai → use same provider for embeddings
@@ -24,8 +24,10 @@
   *   text_provider=openai, embedding_provider=auto      → OpenAI+OpenAI
   *   text_provider=anthropic, embedding_provider=auto   → Claude+Gemini (auto-bridge)
   *   text_provider=anthropic, embedding_provider=voyage → Claude+Voyage (Anthropic-recommended)
-  *   text_provider=anthropic, embedding_provider=openai → Claude+Ollama (cost-optimized)
+  *   text_provider=anthropic, embedding_provider=openai → Claude+OpenAI cloud embeddings
+  *   text_provider=anthropic, embedding_provider=ollama → Claude+Ollama (fully local, zero-cost)
   *   text_provider=gemini, embedding_provider=voyage    → Gemini+Voyage (mixed)
+  *   text_provider=gemini, embedding_provider=ollama    → Gemini+Ollama (hybrid cloud/local)
   *
   * SINGLETON + GRACEFUL DEGRADATION:
   * Same as before — instance cached per process, errors fall back to Gemini.
@@ -44,6 +46,7 @@ import { GeminiAdapter } from "./adapters/gemini.js";
  import { OpenAIAdapter } from "./adapters/openai.js";
  import { AnthropicAdapter } from "./adapters/anthropic.js";
  import { VoyageAdapter } from "./adapters/voyage.js";
+ import { OllamaAdapter } from "./adapters/ollama.js";
  import { TracingLLMProvider } from "./adapters/traced.js";
  // Module-level singleton — one composed provider per MCP server process.
  let providerInstance = null;
@@ -62,10 +65,12 @@ function buildEmbeddingAdapter(type) {
      // Note: "anthropic" is intentionally absent from this switch.
      // Anthropic has no embedding API, so it can never be an embedding provider.
      // The factory resolves "auto" away from "anthropic" before calling this.
-     // For Anthropic text users, "voyage" is the Anthropic-recommended pairing.
+     // For Anthropic text users, "voyage" is the recommended pairing;
+     // "ollama" is the fully local zero-cost alternative.
      switch (type) {
          case "openai": return new OpenAIAdapter();
          case "voyage": return new VoyageAdapter();
+         case "ollama": return new OllamaAdapter();
          case "gemini":
          default: return new GeminiAdapter();
      }
@@ -90,10 +95,15 @@ export function getLLMProvider() {
      let embedType = getSettingSync("embedding_provider", "auto");
      if (embedType === "auto") {
          if (process.env.VOYAGE_API_KEY) {
-             // If Voyage is available, use it as the default embedding provider
-             // since voyage-code-3 strongly outperforms general embeddings on code contexts.
+             // Voyage takes first priority when available — voyage-code-3 strongly
+             // outperforms general embeddings on code contexts.
              embedType = "voyage";
          }
+         else if (process.env.OLLAMA_HOST || process.env.OLLAMA_BASE_URL) {
+             // Ollama is second priority: fully local, zero-cost, zero-latency.
+             // Activated when OLLAMA_HOST or OLLAMA_BASE_URL env var is set.
+             embedType = "ollama";
+         }
          else {
              // Anthropic has no embedding API — auto-bridge to Gemini.
              // For all other text providers, use the same provider for embeddings.
@@ -101,9 +111,9 @@ export function getLLMProvider() {
              if (textType === "anthropic") {
                  console.error("[LLMFactory] text_provider=anthropic with embedding_provider=auto: " +
                      "routing embeddings to GeminiAdapter (Anthropic has no native embedding API). " +
-                     "For the Anthropic-recommended pairing, set embedding_provider=voyage in the dashboard " +
-                     "(voyage-code-3 supports 768-dim output via MRL). " +
-                     "Alternatively, set embedding_provider=openai to use Ollama/OpenAI.");
+                     "For the Anthropic-recommended pairing, set embedding_provider=voyage in the dashboard. " +
+                     "For a fully local, zero-cost option, set embedding_provider=ollama " +
+                     "(requires 'ollama pull nomic-embed-text').");
              }
          }
      }
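The `auto` resolution order implemented across the factory hunks above (explicit setting wins, then Voyage key, then Ollama env vars, then the Anthropic auto-bridge, else same provider as text) condenses to one pure function. A sketch of that documented priority (the function name and signature are ours, not the factory's API):

```javascript
// env is an object shaped like process.env; returns an embedding provider name.
function resolveEmbeddingProvider(textProvider, embeddingSetting, env) {
  if (embeddingSetting !== "auto") return embeddingSetting;   // explicit setting wins
  if (env.VOYAGE_API_KEY) return "voyage";                    // 1st auto priority: Voyage key
  if (env.OLLAMA_HOST || env.OLLAMA_BASE_URL) return "ollama"; // 2nd: local Ollama
  if (textProvider === "anthropic") return "gemini";          // no Anthropic embedding API
  return textProvider;                                        // same provider as text
}

console.log(resolveEmbeddingProvider("anthropic", "auto", {})); // "gemini"
console.log(resolveEmbeddingProvider("anthropic", "auto", { OLLAMA_HOST: "http://localhost:11434" })); // "ollama"
console.log(resolveEmbeddingProvider("gemini", "voyage", {})); // "voyage"
```

Keeping the priority chain in one place like this makes the precedence testable without booting the server or touching the singleton cache.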
@@ -16,6 +16,8 @@
   * - gemini.ts    → Google Gemini (default; all methods including VLM)
   * - openai.ts    → OpenAI Cloud + Ollama + LM Studio + vLLM
   * - anthropic.ts → Anthropic Claude (VLM supported; embeddings unsupported)
+  * - voyage.ts    → Voyage AI (embeddings only; Anthropic-recommended pairing)
+  * - ollama.ts    → Ollama native /api/embed (embeddings only; fully local, zero-cost)
   *
   * FACTORY RESOLUTION:
   * Never instantiate adapters directly. Always call:
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "prism-mcp-server",
-   "version": "7.8.6",
+   "version": "7.8.8",
    "mcpName": "io.github.dcostenco/prism-mcp",
    "description": "The Mind Palace for AI Agents — a true Cognitive Architecture with Hebbian learning (episodic→semantic consolidation), ACT-R spreading activation (multi-hop causal reasoning), uncertainty-aware rejection gates (agents that know when they don't know), adversarial evaluation (anti-sycophancy), fail-closed Dark Factory pipelines, persistent memory (SQLite/Supabase), multi-agent Hivemind, time travel & visual dashboard. Zero-config local mode.",
    "module": "index.ts",