prism-mcp-server 7.8.6 → 7.8.8
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +7 -4
- package/dist/dashboard/ui.js +72 -6
- package/dist/utils/llm/adapters/ollama.js +153 -0
- package/dist/utils/llm/factory.js +19 -9
- package/dist/utils/llm/provider.js +2 -0
- package/package.json +1 -1
package/README.md
CHANGED

@@ -691,9 +691,9 @@ The Generator strips the `console.log`, resubmits, and the next `EVALUATE` returns
 
 ## 🚀 What's New
 
-> **Current release: v7.8.
+> **Current release: v7.8.7 — Cognitive Architecture**
 
-- 🧠 **v7.8.
+- 🧠 **v7.8.x — Cognitive Architecture:** The biggest leap forward yet. Moved beyond flat vector search into a true cognitive architecture inspired by human brain mechanics. Episodic-to-Semantic memory consolidation (Hebbian learning), ACT-R Spreading Activation with multi-hop causal reasoning, Uncertainty-Aware Rejection Gate (your agent can say "I don't know"), and Dynamic Fast Weight Decay (semantic memories outlive episodic chatter by 2×). Validated by the **LoCoMo-Plus benchmark** (arXiv 2602.10715) with Precision@K and MRR metrics. **Your agents don't just remember; they learn.** → [Cognitive Architecture](#-cognitive-architecture-v78)
 - 🚀 **v7.7.0 — Cloud-Native SSE Transport:** Full unauthenticated and authenticated Server-Sent Events MCP support for seamless network deployments.
 - 🩺 **v7.5.0 — Intent Health Dashboard + Security Hardening:** Real-time 0–100 project health scoring (staleness × TODO load × decisions). 10 XSS injection vectors patched. Algorithm hardened with NaN guards and score ceiling.
 - ⚔️ **v7.4.0 — Adversarial Evaluation:** Split-brain anti-sycophancy pipeline. Generator and evaluator in isolated roles with evidence-bound findings.
@@ -968,6 +968,7 @@ Prism is a **stdio-based MCP server** that manages persistent agent memory. Here
 │ │ • ACT-R Spreading Activation (multi-hop)      │ │
 │ │ • Episodic → Semantic Consolidation (Hebbian) │ │
 │ │ • Uncertainty-Aware Rejection Gate            │ │
+│ │ • LoCoMo-Plus Benchmark Validation            │ │
 │ │ • Dynamic Fast Weight Decay (dual-rate)       │ │
 │ │ • HDC Cognitive Routing (XOR binding)         │ │
 │ └───────┬───────────────────────────────────────┘ │
@@ -1056,16 +1057,18 @@ Prism has evolved from smart session logging into a **cognitive memory architecture**
 | **v7.8** | Multi-Hop Causal Reasoning — spreading activation traverses `caused_by`/`led_to` edges with damped fan effect (`1/ln(fan+e)`) and lateral inhibition | ACT-R spreading activation (Anderson), Collins & Loftus (1975) | ✅ Shipped |
 | **v7.8** | Uncertainty-Aware Rejection Gate — dual-signal (similarity floor + gap distance) safety layer prevents hallucination from low-confidence retrievals | Metacognition research, uncertainty quantification | ✅ Shipped |
 | **v7.8** | Dynamic Fast Weight Decay — `is_rollup` semantic nodes decay 50% slower (`ageModifier = 0.5`) than episodic entries, creating Long-Term Context anchors | ACT-R base-level activation with differential decay rates | ✅ Shipped |
+| **v7.8** | LoCoMo Benchmark Harness — deterministic integration suite (`tests/benchmarks/locomo.ts`, 20 assertions) benchmarking multi-hop compaction structures via `MockLLM` | Long-Context Memory evaluation (cognitive benchmarking) | ✅ Shipped |
+| **v7.8** | LoCoMo-Plus Benchmark — 16-assertion suite (`tests/benchmarks/locomo-plus.ts`) adapted from arXiv 2602.10715, validating cue–trigger semantic disconnect bridging via graph traversal and Hebbian consolidation; reports Precision@1/3/5/10 and MRR | LoCoMo-Plus (Li et al., ARR 2026), cue–trigger disconnect research | ✅ Shipped |
 | **v7.x** | Affect-Tagged Memory — sentiment shapes what gets recalled | Affect-modulated retrieval (neuroscience) | 🔭 Horizon |
 | **v8+** | Zero-Search Retrieval — no index, no ANN, just ask the vector | Holographic Reduced Representations | 🔭 Horizon |
 
-> Informed by Anderson's ACT-R (Adaptive Control of Thought–Rational), Collins & Loftus spreading activation networks (1975), Kanerva's SDM (1988), Hebb's learning rule, and LeCun's "Why AI Systems Don't Learn" (Dupoux, LeCun, Malik).
+> Informed by Anderson's ACT-R (Adaptive Control of Thought–Rational), Collins & Loftus spreading activation networks (1975), Kanerva's SDM (1988), Hebb's learning rule, Li et al.'s LoCoMo-Plus (ARR 2026), and LeCun's "Why AI Systems Don't Learn" (Dupoux, LeCun, Malik).
 
 ---
 
 ## 📦 Milestones & Roadmap
 
-> **Current: v7.8.
+> **Current: v7.8.7** — Cognitive Architecture ([CHANGELOG](CHANGELOG.md))
 
 | Release | Headline |
 |---------|----------|
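The damped fan effect named in the roadmap table (`1/ln(fan+e)`) caps how much activation a heavily connected node can spread. A minimal sketch of that damping curve, written as our own illustration rather than code taken from the package:

```javascript
// Damped fan effect: weight each outgoing edge by 1 / ln(fan + e), so the
// activation a node spreads shrinks as its fan-out (edge count) grows.
function fanDampedWeight(fan) {
  return 1 / Math.log(fan + Math.E);
}

// A node with zero fan spreads at full strength, since ln(e) = 1.
console.log(fanDampedWeight(0)); // 1
// Higher fan-out means less activation per edge.
console.log(fanDampedWeight(10) < fanDampedWeight(2)); // true
```

Adding `e` inside the logarithm keeps the denominator at least 1, so the weight never exceeds full strength and never divides by zero.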
package/dist/dashboard/ui.js
CHANGED

@@ -1253,7 +1253,9 @@ Example:\n## Dev Rules\n- Always write tests first\n- Use TypeScript strict mode
               onchange="onEmbeddingProviderChange(this.value)">
         <option value="auto">🔄 Auto (same as Text Provider)</option>
         <option value="gemini">🔵 Gemini</option>
-        <option value="openai">🟢 OpenAI
+        <option value="openai">🟢 OpenAI</option>
+        <option value="voyage">🔮 Voyage AI</option>
+        <option value="ollama">🦙 Ollama (Local)</option>
       </select>
     </div>
@@ -1261,7 +1263,7 @@ Example:\n## Dev Rules\n- Always write tests first\n- Use TypeScript strict mode
     <div id="anthropic-embed-warning" style="display:none;margin-top:0.5rem;padding:0.5rem 0.75rem;background:rgba(251,146,60,0.1);border:1px solid rgba(251,146,60,0.3);border-radius:6px;font-size:0.78rem;color:#fb923c;line-height:1.5">
       ⚠️ <strong>Anthropic has no native embedding API.</strong>
       Auto mode will route embeddings to <strong>Gemini</strong>.
-      Set Embedding Provider to <strong>
+      Set Embedding Provider to <strong>Ollama (Local)</strong> for free local embeddings, or <strong>Voyage AI</strong> for the Anthropic-recommended cloud pairing.
     </div>
 
     <!-- OpenAI embedding model field (shown when embedding_provider = openai) -->
@@ -1269,7 +1271,7 @@ Example:\n## Dev Rules\n- Always write tests first\n- Use TypeScript strict mode
     <div class="setting-row">
       <div>
         <div class="setting-label">Embedding Model</div>
-        <div class="setting-desc">Must output 768 dims.
+        <div class="setting-desc">Must output 768 dims. Default: text-embedding-3-small</div>
       </div>
       <input type="text" id="input-openai-embedding-model"
              placeholder="text-embedding-3-small"
@@ -1279,9 +1281,61 @@ Example:\n## Dev Rules\n- Always write tests first\n- Use TypeScript strict mode
       </div>
     </div>
 
+    <!-- Voyage AI embedding fields (shown when embedding_provider = voyage) -->
+    <div id="embed-fields-voyage" style="display:none">
+      <div class="setting-row">
+        <div>
+          <div class="setting-label">Voyage API Key</div>
+          <div class="setting-desc">Get one free at <a href="https://dash.voyageai.com" target="_blank" style="color:var(--accent)">dash.voyageai.com</a></div>
+        </div>
+        <input type="password" id="input-voyage-api-key"
+               placeholder="pa-…"
+               style="padding: 0.2rem 0.5rem; background: var(--bg-hover); color: var(--text-primary); border: 1px solid var(--border-color); border-radius: 4px; font-size: 0.85rem; font-family: var(--font-mono); width: 180px;"
+               onchange="saveBootSetting('VOYAGE_API_KEY', this.value)"
+               oninput="clearTimeout(this._pv); var self=this; this._pv=setTimeout(function(){saveBootSetting('VOYAGE_API_KEY',self.value)},800)" />
+      </div>
+      <div class="setting-row">
+        <div>
+          <div class="setting-label">Voyage Model</div>
+          <div class="setting-desc">voyage-code-3 (code) · voyage-3 (general). Both MRL → 768 dims.</div>
+        </div>
+        <input type="text" id="input-voyage-model"
+               placeholder="voyage-code-3"
+               style="padding: 0.2rem 0.5rem; background: var(--bg-hover); color: var(--text-primary); border: 1px solid var(--border-color); border-radius: 4px; font-size: 0.85rem; font-family: var(--font-mono); width: 160px;"
+               onchange="saveBootSetting('voyage_model', this.value)"
+               oninput="clearTimeout(this._pvm); var self=this; this._pvm=setTimeout(function(){saveBootSetting('voyage_model',self.value)},800)" />
+      </div>
+    </div>
+
+    <!-- Ollama embedding fields (shown when embedding_provider = ollama) -->
+    <div id="embed-fields-ollama" style="display:none">
+      <div class="setting-row">
+        <div>
+          <div class="setting-label">Ollama Base URL</div>
+          <div class="setting-desc">Where Ollama is running locally</div>
+        </div>
+        <input type="text" id="input-ollama-base-url"
+               placeholder="http://localhost:11434"
+               style="padding: 0.2rem 0.5rem; background: var(--bg-hover); color: var(--text-primary); border: 1px solid var(--border-color); border-radius: 4px; font-size: 0.85rem; font-family: var(--font-mono); width: 220px;"
+               onchange="saveBootSetting('ollama_base_url', this.value)"
+               oninput="clearTimeout(this._pou); var self=this; this._pou=setTimeout(function(){saveBootSetting('ollama_base_url',self.value)},800)" />
+      </div>
+      <div class="setting-row">
+        <div>
+          <div class="setting-label">Embedding Model</div>
+          <div class="setting-desc">Must output 768 dims. <code>nomic-embed-text</code> recommended.</div>
+        </div>
+        <input type="text" id="input-ollama-model"
+               placeholder="nomic-embed-text"
+               style="padding: 0.2rem 0.5rem; background: var(--bg-hover); color: var(--text-primary); border: 1px solid var(--border-color); border-radius: 4px; font-size: 0.85rem; font-family: var(--font-mono); width: 180px;"
+               onchange="saveBootSetting('ollama_model', this.value)"
+               oninput="clearTimeout(this._pom); var self=this; this._pom=setTimeout(function(){saveBootSetting('ollama_model',self.value)},800)" />
+      </div>
+    </div>
+
     <div style="margin-top:1rem;padding:0.6rem 0.8rem;background:rgba(139,92,246,0.08);border:1px solid rgba(139,92,246,0.2);border-radius:6px;font-size:0.78rem;color:var(--text-secondary);line-height:1.5">
-      💡 <strong>
-      Use Claude
+      💡 <strong>Zero-cost setup:</strong> Text Provider → <code>Anthropic</code>, Embedding Provider → <code>Ollama (Local)</code>.<br>
+      Use Claude for reasoning & <code>nomic-embed-text</code> (free, local, 768-dim native) for embeddings.
     </div>
 
     <span class="setting-saved" id="savedToastProviders">Saved ✓</span>
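The "Both MRL → 768 dims" note above refers to Matryoshka Representation Learning: MRL-trained embeddings can be cut to a prefix of their dimensions and renormalized client-side. A sketch of that usual slice-and-renormalize step, as our own illustration (the package's actual Voyage adapter is not shown in this diff and may differ):

```javascript
// Truncate a Matryoshka (MRL) embedding to its first `dims` components,
// then L2-normalize the prefix so cosine similarity stays meaningful.
function truncateMrl(embedding, dims) {
  const head = embedding.slice(0, dims);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0)) || 1;
  return head.map(x => x / norm);
}

// Keep 2 dims of a 6-dim vector: [3, 4] has norm 5, so the result is [0.6, 0.8].
console.log(truncateMrl([3, 4, 0, 0, 7, 7], 2)); // [ 0.6, 0.8 ]
```

This only works well for models trained with MRL objectives; slicing an ordinary embedding this way degrades quality much more sharply.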
@@ -3135,8 +3189,10 @@ function onTextProviderChange(value) {
 // Called when the EMBEDDING provider dropdown changes.
 function onEmbeddingProviderChange(value) {
     var textVal = document.getElementById('select-text-provider').value;
-    // Show
+    // Show provider-specific fields based on the selected embedding provider
     document.getElementById('embed-fields-openai').style.display = value === 'openai' ? '' : 'none';
+    document.getElementById('embed-fields-voyage').style.display = value === 'voyage' ? '' : 'none';
+    document.getElementById('embed-fields-ollama').style.display = value === 'ollama' ? '' : 'none';
     refreshAnthropicWarning(textVal, value);
     saveBootSetting('embedding_provider', value);
 }
@@ -3174,7 +3230,17 @@ function loadAiProviderSettings() {
     if (embedSel)
         embedSel.value = embedProvider;
     document.getElementById('embed-fields-openai').style.display = embedProvider === 'openai' ? '' : 'none';
+    document.getElementById('embed-fields-voyage').style.display = embedProvider === 'voyage' ? '' : 'none';
+    document.getElementById('embed-fields-ollama').style.display = embedProvider === 'ollama' ? '' : 'none';
     refreshAnthropicWarning(textProvider, embedProvider);
+    var vKey = document.getElementById('input-voyage-api-key');
+    if (vKey) vKey.placeholder = s.VOYAGE_API_KEY ? '(key saved — paste to update)' : 'pa-…';
+    var vMod = document.getElementById('input-voyage-model');
+    if (vMod && s.voyage_model) vMod.value = s.voyage_model;
+    var olUrl = document.getElementById('input-ollama-base-url');
+    if (olUrl && s.ollama_base_url) olUrl.value = s.ollama_base_url;
+    var olMod = document.getElementById('input-ollama-model');
+    if (olMod && s.ollama_model) olMod.value = s.ollama_model;
     gKey = document.getElementById('input-google-api-key');
     if (gKey)
         gKey.placeholder = s.GOOGLE_API_KEY ? '(key saved — paste to update)' : 'AIza…';
package/dist/utils/llm/adapters/ollama.js
ADDED

@@ -0,0 +1,153 @@
+/**
+ * Ollama Adapter (v1.0 — nomic-embed-text)
+ * ─────────────────────────────────────────────────────────────────────────────
+ * PURPOSE:
+ * Implements LLMProvider using Ollama's native /api/embed REST endpoint for
+ * fully local, zero-cost text embeddings. No API key required — Ollama runs
+ * on localhost.
+ *
+ * TEXT GENERATION:
+ * This adapter is embeddings-only. generateText() throws an explicit error.
+ * Set text_provider separately (anthropic, openai, or gemini).
+ *
+ * EMBEDDING DIMENSION PARITY (768 dims):
+ * Prism's SQLite (sqlite-vec) and Supabase (pgvector) schemas define
+ * embedding columns as EXACTLY 768 dimensions.
+ *
+ * nomic-embed-text natively outputs 768 dims — zero truncation needed.
+ * It is the recommended default local model for Prism.
+ *
+ * SUPPORTED MODELS (all confirmed 768-dim via Ollama):
+ *   nomic-embed-text       — 768 dims, 274MB, best quality/size trade-off ✅ DEFAULT
+ *   nomic-embed-text:v1.5  — 768 dims, 274MB, same (stable alias)
+ *
+ * Models to AVOID with this adapter (wrong dim count):
+ *   mxbai-embed-large      — 1024 dims ❌ (use OpenAIAdapter instead)
+ *   all-minilm             — 384 dims ❌
+ *   snowflake-arctic-embed — varies ❌
+ *
+ * BATCH EMBEDDINGS:
+ * Uses /api/embed (plural) which is the official Ollama batch endpoint
+ * introduced in Ollama ≥ 0.3.0. Falls back gracefully for older versions.
+ *
+ * CONFIG KEYS (Prism dashboard "AI Providers" tab OR environment variables):
+ *   ollama_base_url — Base URL of Ollama server (default: http://localhost:11434)
+ *   ollama_model    — Embedding model (default: nomic-embed-text)
+ *
+ * USAGE:
+ * In the Prism dashboard, set:
+ *   embedding_provider = ollama
+ * Optionally set ollama_base_url and ollama_model to override defaults.
+ *
+ * API REFERENCE:
+ * https://github.com/ollama/ollama/blob/main/docs/api.md#generate-embeddings
+ */
+import { getSettingSync } from "../../../storage/configStorage.js";
+import { debugLog } from "../../logger.js";
+// ─── Constants ───────────────────────────────────────────────────────────────
+// Must match Prism's DB schema (sqlite-vec and pgvector column sizes).
+const EMBEDDING_DIMS = 768;
+// Generous character cap — nomic-embed-text has an 8192-token context window.
+const MAX_EMBEDDING_CHARS = 8000;
+const DEFAULT_BASE_URL = "http://localhost:11434";
+const DEFAULT_MODEL = "nomic-embed-text";
+// Connection retry settings — handles the common "forgot to start Ollama" race.
+const MAX_RETRIES = 2;
+const RETRY_DELAY_MS = 500;
+// ─── Adapter ─────────────────────────────────────────────────────────────────
+export class OllamaAdapter {
+    baseUrl;
+    model;
+    constructor() {
+        this.baseUrl = getSettingSync("ollama_base_url", DEFAULT_BASE_URL).replace(/\/$/, "");
+        this.model = getSettingSync("ollama_model", DEFAULT_MODEL);
+        debugLog(`[OllamaAdapter] Initialized — baseUrl=${this.baseUrl}, model=${this.model}`);
+    }
+    // ─── Text Generation (Not Supported) ────────────────────────────────────
+    async generateText(_prompt, _systemInstruction) {
+        throw new Error("OllamaAdapter does not support text generation. " +
+            "Set text_provider to 'anthropic', 'openai', or 'gemini' in the dashboard.");
+    }
+    // ─── Batch Embedding Generation ─────────────────────────────────────────
+    async generateEmbeddings(texts) {
+        if (!texts || texts.length === 0)
+            return [];
+        const model = this.model;
+        // Word-safe truncation — consistent with Voyage and OpenAI adapters.
+        const truncatedTexts = texts.map(text => {
+            if (text.length > MAX_EMBEDDING_CHARS) {
+                const cut = text.slice(0, MAX_EMBEDDING_CHARS);
+                const lastSpace = cut.lastIndexOf(" ");
+                return lastSpace > 0 ? cut.slice(0, lastSpace) : cut;
+            }
+            return text;
+        });
+        debugLog(`[OllamaAdapter] generateEmbeddings — model=${model}, count=${truncatedTexts.length}`);
+        // Retry loop — catches ECONNREFUSED when Ollama service hasn't started yet.
+        let response;
+        let lastError = null;
+        for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
+            try {
+                response = await fetch(`${this.baseUrl}/api/embed`, {
+                    method: "POST",
+                    headers: { "Content-Type": "application/json" },
+                    body: JSON.stringify({ model, input: truncatedTexts }),
+                });
+                if (!response.ok) {
+                    const errorText = await response.text().catch(() => "unknown error");
+                    throw new Error(`[OllamaAdapter] /api/embed request failed — status=${response.status}: ${errorText}. ` +
+                        `Make sure Ollama is running (ollama serve) and '${model}' has been pulled (ollama pull ${model}).`);
+                }
+                // Success — break out of retry loop.
+                lastError = null;
+                break;
+            }
+            catch (err) {
+                lastError = err instanceof Error ? err : new Error(String(err));
+                const isNetworkError = lastError.message.includes("ECONNREFUSED") ||
+                    lastError.message.includes("fetch failed") ||
+                    lastError.message.includes("ECONNRESET");
+                if (isNetworkError && attempt < MAX_RETRIES) {
+                    debugLog(`[OllamaAdapter] Connection failed (attempt ${attempt + 1}/${MAX_RETRIES + 1}): ` +
+                        `${lastError.message.substring(0, 80)}. Retrying in ${RETRY_DELAY_MS}ms...`);
+                    await new Promise(resolve => setTimeout(resolve, RETRY_DELAY_MS));
+                    continue;
+                }
+                throw lastError;
+            }
+        }
+        if (lastError)
+            throw lastError;
+        const data = (await response.json());
+        const embeddings = data?.embeddings;
+        if (!Array.isArray(embeddings) || embeddings.length === 0) {
+            throw new Error(`[OllamaAdapter] Empty embeddings response from model '${model}'.`);
+        }
+        if (embeddings.length !== texts.length) {
+            throw new Error(`[OllamaAdapter] Response length mismatch — expected ${texts.length}, got ${embeddings.length}.`);
+        }
+        // Validate dimensions and slice if model returned > 768 (shouldn't happen
+        // with nomic-embed-text but guards against model swaps).
+        return embeddings.map((emb, i) => {
+            if (emb.length > EMBEDDING_DIMS) {
+                debugLog(`[OllamaAdapter] Embedding[${i}] has ${emb.length} dims — truncating to ${EMBEDDING_DIMS}. ` +
+                    `Consider using a model that natively outputs ${EMBEDDING_DIMS} dims (e.g. nomic-embed-text).`);
+                return emb.slice(0, EMBEDDING_DIMS);
+            }
+            if (emb.length !== EMBEDDING_DIMS) {
+                throw new Error(`[OllamaAdapter] Dimension mismatch at index ${i}: expected ${EMBEDDING_DIMS}, ` +
+                    `got ${emb.length}. Model '${model}' is not compatible with Prism's 768-dim schema. ` +
+                    `Use nomic-embed-text which natively outputs 768 dims.`);
+            }
+            return emb;
+        });
+    }
+    // ─── Single Embedding (delegates to batch) ──────────────────────────────
+    async generateEmbedding(text) {
+        if (!text || !text.trim()) {
+            throw new Error("[OllamaAdapter] generateEmbedding called with empty text.");
+        }
+        const results = await this.generateEmbeddings([text]);
+        return results[0];
+    }
+}
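The word-safe truncation inside `generateEmbeddings` above can be factored into a standalone helper. This is our own extraction for illustration, mirroring the adapter's logic rather than an export of the package:

```javascript
// Cut text to at most maxChars, backing up to the last space so a word is
// never split mid-way (the same behavior as the adapter's MAX_EMBEDDING_CHARS cap).
function truncateWordSafe(text, maxChars) {
  if (text.length <= maxChars) return text;
  const cut = text.slice(0, maxChars);
  const lastSpace = cut.lastIndexOf(" ");
  // If there is no space at all, a hard cut is the only option.
  return lastSpace > 0 ? cut.slice(0, lastSpace) : cut;
}

console.log(truncateWordSafe("hello world again", 13)); // "hello world"
console.log(truncateWordSafe("unbroken", 4));           // "unbr"
```

Backing up to a space matters for embeddings: a half-word at the end of the input is a token the model never saw in training, so trimming it is cheap insurance.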
package/dist/utils/llm/factory.js
CHANGED

@@ -1,5 +1,5 @@
 /**
- * LLM Provider Factory (v4.
+ * LLM Provider Factory (v4.6 — Ollama Local Embedding Support)
  * ─────────────────────────────────────────────────────────────────────────────
  * PURPOSE:
  * Single point of resolution for the active LLMProvider.
@@ -11,7 +11,7 @@
  * Two independent settings control text and embedding routing:
  *
  *   text_provider      — "gemini" (default) | "openai" | "anthropic"
- *   embedding_provider — "auto" (default) | "gemini" | "openai" | "voyage"
+ *   embedding_provider — "auto" (default) | "gemini" | "openai" | "voyage" | "ollama"
 *
 * When embedding_provider = "auto":
 *   * If text_provider is gemini or openai → use same provider for embeddings
@@ -24,8 +24,10 @@
 *   text_provider=openai, embedding_provider=auto → OpenAI+OpenAI
 *   text_provider=anthropic, embedding_provider=auto → Claude+Gemini (auto-bridge)
 *   text_provider=anthropic, embedding_provider=voyage → Claude+Voyage (Anthropic-recommended)
- *   text_provider=anthropic, embedding_provider=openai → Claude+
+ *   text_provider=anthropic, embedding_provider=openai → Claude+OpenAI cloud embeddings
+ *   text_provider=anthropic, embedding_provider=ollama → Claude+Ollama (fully local, zero-cost)
 *   text_provider=gemini, embedding_provider=voyage → Gemini+Voyage (mixed)
+ *   text_provider=gemini, embedding_provider=ollama → Gemini+Ollama (hybrid cloud/local)
 *
 * SINGLETON + GRACEFUL DEGRADATION:
 * Same as before — instance cached per process, errors fall back to Gemini.
@@ -44,6 +46,7 @@ import { GeminiAdapter } from "./adapters/gemini.js";
 import { OpenAIAdapter } from "./adapters/openai.js";
 import { AnthropicAdapter } from "./adapters/anthropic.js";
 import { VoyageAdapter } from "./adapters/voyage.js";
+import { OllamaAdapter } from "./adapters/ollama.js";
 import { TracingLLMProvider } from "./adapters/traced.js";
 // Module-level singleton — one composed provider per MCP server process.
 let providerInstance = null;
@@ -62,10 +65,12 @@ function buildEmbeddingAdapter(type) {
     // Note: "anthropic" is intentionally absent from this switch.
     // Anthropic has no embedding API, so it can never be an embedding provider.
     // The factory resolves "auto" away from "anthropic" before calling this.
-    // For Anthropic text users, "voyage" is the
+    // For Anthropic text users, "voyage" is the recommended pairing;
+    // "ollama" is the fully local zero-cost alternative.
     switch (type) {
         case "openai": return new OpenAIAdapter();
         case "voyage": return new VoyageAdapter();
+        case "ollama": return new OllamaAdapter();
         case "gemini":
         default: return new GeminiAdapter();
     }
@@ -90,10 +95,15 @@ export function getLLMProvider() {
     let embedType = getSettingSync("embedding_provider", "auto");
     if (embedType === "auto") {
         if (process.env.VOYAGE_API_KEY) {
-            //
-            //
+            // Voyage takes first priority when available — voyage-code-3 strongly
+            // outperforms general embeddings on code contexts.
             embedType = "voyage";
         }
+        else if (process.env.OLLAMA_HOST || process.env.OLLAMA_BASE_URL) {
+            // Ollama is second priority: fully local, zero-cost, zero-latency.
+            // Activated when OLLAMA_HOST or OLLAMA_BASE_URL env var is set.
+            embedType = "ollama";
+        }
         else {
             // Anthropic has no embedding API — auto-bridge to Gemini.
             // For all other text providers, use the same provider for embeddings.
@@ -101,9 +111,9 @@ export function getLLMProvider() {
         if (textType === "anthropic") {
             console.error("[LLMFactory] text_provider=anthropic with embedding_provider=auto: " +
                 "routing embeddings to GeminiAdapter (Anthropic has no native embedding API). " +
-                "For the Anthropic-recommended pairing, set embedding_provider=voyage in the dashboard " +
-                "
-                "
+                "For the Anthropic-recommended pairing, set embedding_provider=voyage in the dashboard. " +
+                "For a fully local, zero-cost option, set embedding_provider=ollama " +
+                "(requires 'ollama pull nomic-embed-text').");
         }
     }
 }
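The factory's `auto` branch above resolves in a fixed order: Voyage if a key is present, then Ollama if a host is configured, otherwise same-as-text with Anthropic bridged to Gemini. That precedence can be sketched as a pure function; the name and signature here are our own illustration, not the package's exports:

```javascript
// Resolve the effective embedding provider the way the factory's "auto"
// branch does: Voyage > Ollama > same-as-text, with Anthropic → Gemini.
function resolveEmbedProvider(setting, env, textProvider) {
  if (setting !== "auto") return setting; // explicit choice always wins
  if (env.VOYAGE_API_KEY) return "voyage";
  if (env.OLLAMA_HOST || env.OLLAMA_BASE_URL) return "ollama";
  // Anthropic has no embedding API, so auto bridges it to Gemini.
  return textProvider === "anthropic" ? "gemini" : textProvider;
}

console.log(resolveEmbedProvider("auto", {}, "anthropic"));                        // "gemini"
console.log(resolveEmbedProvider("auto", { OLLAMA_HOST: "localhost" }, "gemini")); // "ollama"
console.log(resolveEmbedProvider("voyage", {}, "openai"));                         // "voyage"
```

Keeping the resolution pure like this makes the precedence testable without touching the singleton or real environment variables.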
package/dist/utils/llm/provider.js
CHANGED

@@ -16,6 +16,8 @@
 * - gemini.ts    — Google Gemini (default; all methods including VLM)
 * - openai.ts    — OpenAI Cloud + Ollama + LM Studio + vLLM
 * - anthropic.ts — Anthropic Claude (VLM supported; embeddings unsupported)
+ * - voyage.ts    — Voyage AI (embeddings only; Anthropic-recommended pairing)
+ * - ollama.ts    — Ollama native /api/embed (embeddings only; fully local, zero-cost)
 *
 * FACTORY RESOLUTION:
 * Never instantiate adapters directly. Always call:
package/package.json
CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "prism-mcp-server",
-  "version": "7.8.
+  "version": "7.8.8",
   "mcpName": "io.github.dcostenco/prism-mcp",
   "description": "The Mind Palace for AI Agents — a true Cognitive Architecture with Hebbian learning (episodic→semantic consolidation), ACT-R spreading activation (multi-hop causal reasoning), uncertainty-aware rejection gates (agents that know when they don't know), adversarial evaluation (anti-sycophancy), fail-closed Dark Factory pipelines, persistent memory (SQLite/Supabase), multi-agent Hivemind, time travel & visual dashboard. Zero-config local mode.",
   "module": "index.ts",