prism-mcp-server 15.7.4 → 16.1.0
- package/README.md +47 -6
- package/dist/aba-protocol.js +2 -2
- package/dist/hivemindWatchdog.js +1 -1
- package/dist/storage/sqlite.js +27 -0
- package/dist/storage/supabase.js +35 -6
- package/dist/tools/ledgerHandlers.js +35 -0
- package/dist/utils/analytics.js +1 -1
- package/dist/utils/llm/adapters/gemini.js +52 -1
- package/dist/utils/llm/adapters/openai.js +38 -2
- package/dist/utils/localLlm.js +1 -1
- package/dist/utils/universalImporter.js +12 -11
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -157,8 +157,45 @@ Categories: abstention, adversarial traps, cascade, disambiguation, edge cases,
|
|
|
157
157
|
### L3 Grounding Verifier
|
|
158
158
|
When `prism_infer` receives an `evidence` payload, the grounding verifier automatically checks the model's response against the provided evidence before returning to the caller. Unverified or hallucinated claims are flagged. This is the third layer (L3) of the cascade, after tool routing (L1) and confidence gating (L2).
|
|
159
159
|
|
|
160
|
-
### ⚡ Zero-search retrieval
|
|
161
|
-
Holographic Reduced Representations (HRR) for instant
|
|
160
|
+
### ⚡ Zero-search retrieval *(new in v15.8)*
|
|
161
|
+
Holographic Reduced Representations (HRR) via Rust WASM for instant memory retrieval without a database query.
|
|
162
|
+
|
|
163
|
+
**Three adaptive strategies:**
|
|
164
|
+
- **GloVe embeddings** (offline, 50K words): 87% Top-1 accuracy, stable at 200+ concepts
|
|
165
|
+
- **API embeddings** (Gemini/Voyage): 90%+ accuracy when online
|
|
166
|
+
- **NeurIPS 2021 projection**: unit-modulus normalization for numerical stability
|
|
167
|
+
|
|
168
|
+
**Retrieval cascade:** HRR (~0.2ms) → FTS5 (~50ms) → Supabase (~200ms)
|
|
169
|
+
|
|
170
|
+
| Metric | HRR (WASM) | FTS5 | Supabase Vector |
|
|
171
|
+
|--------|-----------|------|-----------------|
|
|
172
|
+
| Latency | **0.2ms** | 50ms | 200ms |
|
|
173
|
+
| Speedup | **1x** | 250x slower | 1000x slower |
|
|
174
|
+
| Offline | **Yes** | Yes | No |
|
|
175
|
+
| Accuracy (GloVe) | **87% Top-1** | 95%+ | 95%+ |
|
|
176
|
+
| Hologram size | **8KB** | Index varies | Cloud |
|
|
177
|
+
|
|
178
|
+
HRR acts as Tier 0: if confidence is high, FTS5 is skipped entirely. Falls through gracefully when HRR has no match. 97 dedicated tests (72 system + 25 API/client). Built with Rust + `rustfft` + `wasm-bindgen` (229KB binary).
|
|
179
|
+
|
|
180
|
+
**HRR AAC prediction benchmark**: real-world impact on Prism AAC word prediction (10 scenarios, 54 integration tests):
|
|
181
|
+
|
|
182
|
+
| Scenario | Baseline Top-1 | +HRR Top-1 | Top-1 Lift | MRR Lift |
|
|
183
|
+
|----------|---------------|------------|-----------|----------|
|
|
184
|
+
| Core AAC phrases | 36.7% | 46.7% | **+27.3%** | +6.0% |
|
|
185
|
+
| Personal vocabulary | 70.4% | 81.5% | **+15.8%** | +9.2% |
|
|
186
|
+
| Mixed (all phrases) | 47.2% | 56.9% | **+20.6%** | +5.7% |
|
|
187
|
+
| Cross-session recall | 80.0% | 80.0% | +0.0% | +0.0% |
|
|
188
|
+
|
|
189
|
+
Top-1 = correct word is tile #1. MRR = Mean Reciprocal Rank. Zero Top-5 regressions in any scenario. HRR encodes bigrams + trigrams from every spoken phrase; probes take ~0.2ms, safe on every keystroke. All Synalux apps (clinical, AAC, PrismCoach) share HRR via the portal `/api/v1/hrr` endpoint.
|
|
190
|
+
|
|
191
|
+
**Competitive comparison:**
|
|
192
|
+
|
|
193
|
+
| System | Retrieval | Offline | Cost | Latency |
|
|
194
|
+
|--------|-----------|---------|------|---------|
|
|
195
|
+
| **Prism Coder** | **HRR + FTS5 + Supabase cascade** | **Yes** | **$0** | **0.2ms** |
|
|
196
|
+
| Mem0 | Vector DB (Qdrant/Pinecone) | No | $249/mo | ~100ms |
|
|
197
|
+
| Zep | Vector DB + temporal graph | No | $99/mo | ~80ms |
|
|
198
|
+
| Hermes (NousResearch) | HRR + SQLite | Yes | Free | ~5ms |
|
|
162
199
|
|
|
163
200
|
### Multi-agent Hivemind
|
|
164
201
|
Multiple AI agents share the same Mind Palace. Each agent has a role (dev / qa / pm / etc.) and sees scoped context. Heartbeat + roster for coordination.
|
|
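The README text in this hunk describes HRR as a Tier 0 that answers when its confidence is high and otherwise falls through to FTS5 and then Supabase. A minimal sketch of that control flow is below. It is not taken from the package: `retrieveWithCascade`, the `{ value, confidence }` hit shape, the stub tiers, and the `0.8` threshold are all hypothetical names and values chosen for illustration.

```javascript
// Hypothetical sketch of a tiered retrieval cascade. Tier names mirror the
// README (HRR -> FTS5 -> Supabase); the search functions, the hit shape and
// the confidence threshold are assumptions, not the package's API.
async function retrieveWithCascade(query, tiers, minConfidence = 0.8) {
  for (const tier of tiers) {
    // Each tier is assumed to resolve to { value, confidence } or null.
    const hit = await tier.search(query);
    if (hit && hit.confidence >= minConfidence) {
      // A confident answer short-circuits: slower tiers are never queried.
      return { tier: tier.name, ...hit };
    }
    // No match or low confidence: fall through to the next, slower tier.
  }
  return null;
}

// Stub tiers standing in for the real backends.
const demoTiers = [
  { name: "hrr", search: async (q) => (q === "known" ? { value: "from-hrr", confidence: 0.95 } : null) },
  { name: "fts5", search: async (q) => (q === "rare" ? { value: "from-fts5", confidence: 0.9 } : null) },
  { name: "supabase", search: async () => ({ value: "from-supabase", confidence: 1 }) },
];
```

Ordering the tiers from fastest to slowest means the common case pays only the cheapest lookup, while a miss costs the sum of the tiers tried.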
@@ -436,7 +473,7 @@ prism register-models # Alias dcostenco/prism-coder:* → prism-coder:*
|
|
|
436
473
|
## Testing
|
|
437
474
|
|
|
438
475
|
```bash
|
|
439
|
-
npm test #
|
|
476
|
+
npm test # 2,418 test cases across 81 files (vitest)
|
|
440
477
|
npm test -- --coverage # coverage report
|
|
441
478
|
python3 tests/benchmarks/prism-routing-100/benchmark.py --models 1b7 14b 32b
|
|
442
479
|
```
|
|
@@ -444,12 +481,16 @@ python3 tests/benchmarks/prism-routing-100/benchmark.py --models 1b7 14b 32b
|
|
|
444
481
|
**Pinned in CI**: 327 tests enforce every constant: ACT-R decay `d=0.25`, spreading-activation hybrid score `0.7/0.3`, experience bias `MIN_SAMPLES=5` / `MAX_BIAS_CAP=0.15`, graph-metrics warning ratios `0.20 / 0.30 / 0.40`, compaction's 25KB prompt-budget. CI catches divergence automatically.
|
|
445
482
|
|
|
446
483
|
**Coverage areas**:
|
|
447
|
-
- HRR (
|
|
448
|
-
-
|
|
484
|
+
- HRR zero-search retrieval (97 tests: 3 embedding strategies, edge cases, persistence, adaptive cascade, API client, chat integration)
|
|
485
|
+
- Knowledge ingestion (32 tests: chunker, Q&A gen, webhook, security, storage round-trip)
|
|
486
|
+
- Prism infer cascade (110 tests: tier selection, cloud fallback, grounding verifier)
|
|
487
|
+
- Compaction handler (rollup creation, concurrency guard, LLM failure)
|
|
488
|
+
- Model picker (20 tests: 14b default ceiling, 4b verifier, RAM gating)
|
|
489
|
+
- Storage round-trip (12 architectural guard tests preventing bypass)
|
|
449
490
|
- BCBA skill integration
|
|
450
491
|
- Deep storage tier
|
|
451
492
|
- Dashboard rendering
|
|
452
|
-
- Routing benchmarks (
|
|
493
|
+
- Routing benchmarks (eval_300: 300 cases, 17 tools)
|
|
453
494
|
|
|
454
495
|
## Migration
|
|
455
496
|
|
package/dist/aba-protocol.js
CHANGED
|
@@ -70,7 +70,7 @@ export const RULE7_VSCODE = [
|
|
|
70
70
|
].join('\n');
|
|
71
71
|
// ─── Assemblers ─────────────────────────────────────────────────
|
|
72
72
|
/** Assemble the full ABA protocol for Cloud Portal */
|
|
73
|
-
|
|
73
|
+
function _unused_buildCloudPrompt(toolsSection) {
|
|
74
74
|
return [
|
|
75
75
|
toolsSection,
|
|
76
76
|
'',
|
|
@@ -106,7 +106,7 @@ export function sanitizeUserInput(text) {
|
|
|
106
106
|
return sanitizeMcpOutput(text);
|
|
107
107
|
}
|
|
108
108
|
/** Wrap user input in <user_input> tags after sanitization */
|
|
109
|
-
|
|
109
|
+
function _unused_wrapUserInput(text) {
|
|
110
110
|
const safe = sanitizeUserInput(text);
|
|
111
111
|
return `<user_input>\n${safe}\n</user_input>`;
|
|
112
112
|
}
|
package/dist/hivemindWatchdog.js
CHANGED
|
@@ -66,7 +66,7 @@ export function drainAlerts(project) {
|
|
|
66
66
|
/**
|
|
67
67
|
* Get count of pending alerts (for testing/debugging).
|
|
68
68
|
*/
|
|
69
|
-
|
|
69
|
+
function _unused_getPendingAlertCount() {
|
|
70
70
|
return pendingAlerts.size;
|
|
71
71
|
}
|
|
72
72
|
// ─── Watchdog Lifecycle ──────────────────────────────────────
|
package/dist/storage/sqlite.js
CHANGED
|
@@ -1183,6 +1183,33 @@ export class SqliteStorage {
|
|
|
1183
1183
|
version: result.rows[0].version,
|
|
1184
1184
|
};
|
|
1185
1185
|
}
|
|
1186
|
+
async patchHandoff(project, userId, data) {
|
|
1187
|
+
const ALLOWED_COLUMNS = new Set([
|
|
1188
|
+
'embedding', 'embedding_compressed', 'embedding_format', 'embedding_turbo_radius',
|
|
1189
|
+
]);
|
|
1190
|
+
const sets = [];
|
|
1191
|
+
const args = [];
|
|
1192
|
+
for (const [key, value] of Object.entries(data)) {
|
|
1193
|
+
if (!ALLOWED_COLUMNS.has(key)) {
|
|
1194
|
+
throw new Error(`[SqliteStorage] patchHandoff: rejected unknown column "${key}".`);
|
|
1195
|
+
}
|
|
1196
|
+
if (key === "embedding") {
|
|
1197
|
+
sets.push(`${key} = vector(?)`);
|
|
1198
|
+
args.push((typeof value === "string" ? value : JSON.stringify(value)));
|
|
1199
|
+
}
|
|
1200
|
+
else {
|
|
1201
|
+
sets.push(`${key} = ?`);
|
|
1202
|
+
args.push((typeof value === "object" && value !== null ? JSON.stringify(value) : value));
|
|
1203
|
+
}
|
|
1204
|
+
}
|
|
1205
|
+
if (sets.length === 0)
|
|
1206
|
+
return;
|
|
1207
|
+
args.push(project, userId);
|
|
1208
|
+
await this.db.execute({
|
|
1209
|
+
sql: `UPDATE session_handoffs SET ${sets.join(", ")} WHERE project = ? AND user_id = ?`,
|
|
1210
|
+
args,
|
|
1211
|
+
});
|
|
1212
|
+
}
|
|
1186
1213
|
async deleteHandoff(project, userId) {
|
|
1187
1214
|
await this.db.execute({
|
|
1188
1215
|
sql: "DELETE FROM session_handoffs WHERE project = ? AND user_id = ?",
|
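The `patchHandoff` hunk above interpolates column names into the SQL text but binds every value as a parameter, so the allow-list is what keeps the interpolation safe. The sketch below isolates that statement-building step as a pure function so it can be inspected without a database. `buildPatchStatement` is a hypothetical helper name; the column set and SQL shape are copied from the hunk.

```javascript
// Sketch of allow-list-based SET clause construction, following the
// patchHandoff hunk. buildPatchStatement is a hypothetical helper that
// returns the statement instead of executing it.
const ALLOWED_COLUMNS = new Set([
  "embedding", "embedding_compressed", "embedding_format", "embedding_turbo_radius",
]);

function buildPatchStatement(project, userId, data) {
  const sets = [];
  const args = [];
  for (const [key, value] of Object.entries(data)) {
    // Column names reach the SQL text, so only known names are accepted.
    if (!ALLOWED_COLUMNS.has(key)) {
      throw new Error(`rejected unknown column "${key}"`);
    }
    if (key === "embedding") {
      // The raw embedding is passed through vector(?) as a JSON string.
      sets.push(`${key} = vector(?)`);
      args.push(typeof value === "string" ? value : JSON.stringify(value));
    } else {
      sets.push(`${key} = ?`);
      args.push(typeof value === "object" && value !== null ? JSON.stringify(value) : value);
    }
  }
  if (sets.length === 0) return null; // nothing to update
  args.push(project, userId);
  return {
    sql: `UPDATE session_handoffs SET ${sets.join(", ")} WHERE project = ? AND user_id = ?`,
    args,
  };
}
```

Returning `{ sql, args }` rather than executing keeps the sketch testable; a real caller would pass the result to its database client.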
package/dist/storage/supabase.js
CHANGED
|
@@ -161,6 +161,12 @@ export class SupabaseStorage {
|
|
|
161
161
|
};
|
|
162
162
|
}
|
|
163
163
|
}
|
|
164
|
+
async patchHandoff(project, userId, data) {
|
|
165
|
+
await supabasePatch("session_handoffs", data, {
|
|
166
|
+
project: `eq.${project}`,
|
|
167
|
+
user_id: `eq.${userId}`,
|
|
168
|
+
});
|
|
169
|
+
}
|
|
164
170
|
async deleteHandoff(project, userId) {
|
|
165
171
|
await supabaseDelete("session_handoffs", {
|
|
166
172
|
project: `eq.${project}`,
|
|
@@ -285,12 +291,36 @@ export class SupabaseStorage {
|
|
|
285
291
|
queryParams.project = `eq.${params.project}`;
|
|
286
292
|
if (params.role)
|
|
287
293
|
queryParams.role = `eq.${params.role}`;
|
|
288
|
-
const
|
|
294
|
+
const ledgerRows = await supabaseGet("session_ledger", queryParams);
|
|
295
|
+
// Also fetch handoff entries with embeddings
|
|
296
|
+
const handoffParams = {
|
|
297
|
+
user_id: `eq.${params.userId}`,
|
|
298
|
+
embedding_compressed: "not.is.null",
|
|
299
|
+
select: "id,project,last_summary,active_decisions,updated_at,embedding_compressed,embedding_turbo_radius",
|
|
300
|
+
limit: "500",
|
|
301
|
+
};
|
|
302
|
+
if (params.project)
|
|
303
|
+
handoffParams.project = `eq.${params.project}`;
|
|
304
|
+
if (params.role)
|
|
305
|
+
handoffParams.role = `eq.${params.role}`;
|
|
306
|
+
const handoffRows = await supabaseGet("session_handoffs", handoffParams);
|
|
307
|
+
// Normalize handoff rows to match ledger shape for scoring
|
|
308
|
+
const normalizedHandoffs = (Array.isArray(handoffRows) ? handoffRows : []).map(h => ({
|
|
309
|
+
...h,
|
|
310
|
+
summary: h.last_summary || "",
|
|
311
|
+
decisions: h.active_decisions || [],
|
|
312
|
+
files_changed: [],
|
|
313
|
+
session_date: h.updated_at,
|
|
314
|
+
created_at: h.updated_at,
|
|
315
|
+
}));
|
|
316
|
+
const rows = [
|
|
317
|
+
...(Array.isArray(ledgerRows) ? ledgerRows : []),
|
|
318
|
+
...normalizedHandoffs,
|
|
319
|
+
];
|
|
289
320
|
const scored = [];
|
|
290
|
-
// v9.3: Import tiebreaker config for optional residualNorm ranking
|
|
291
321
|
const { PRISM_TURBOQUANT_TIEBREAKER_EPSILON } = await import("../config.js");
|
|
292
322
|
const eps = PRISM_TURBOQUANT_TIEBREAKER_EPSILON;
|
|
293
|
-
for (const row of
|
|
323
|
+
for (const row of rows) {
|
|
294
324
|
try {
|
|
295
325
|
const compressedBase64 = row.embedding_compressed;
|
|
296
326
|
const buf = Buffer.from(compressedBase64, "base64");
|
|
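The hunk above merges `session_handoffs` rows into the ledger rows by renaming a few fields so that both can go through the same scoring loop. A standalone sketch of that normalization follows. The function names `normalizeHandoff` and `mergeForScoring` are hypothetical; the field mapping is the one shown in the hunk.

```javascript
// Sketch of the handoff-to-ledger normalization. Function names are
// hypothetical; the field mapping follows the diff.
function normalizeHandoff(h) {
  return {
    ...h,
    summary: h.last_summary || "",
    decisions: h.active_decisions || [],
    files_changed: [],           // handoffs carry no file list
    session_date: h.updated_at,  // updated_at stands in for both dates
    created_at: h.updated_at,
  };
}

function mergeForScoring(ledgerRows, handoffRows) {
  // Guard against non-array responses (for example null on an empty result).
  const ledger = Array.isArray(ledgerRows) ? ledgerRows : [];
  const handoffs = Array.isArray(handoffRows) ? handoffRows : [];
  return {
    rows: [...ledger, ...handoffs.map(normalizeHandoff)],
    ledgerCount: ledger.length,
    handoffCount: handoffs.length,
  };
}
```

Taking the counts from the guarded arrays is a design choice of this sketch: it means a later log line can report them without reading `.length` on a value that may not be an array.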
@@ -313,7 +343,6 @@ export class SupabaseStorage {
|
|
|
313
343
|
// Skip entries with corrupt compressed data
|
|
314
344
|
}
|
|
315
345
|
}
|
|
316
|
-
// Sort by similarity descending, with optional residualNorm tiebreaker
|
|
317
346
|
scored.sort((a, b) => {
|
|
318
347
|
const diff = b.similarity - a.similarity;
|
|
319
348
|
if (eps > 0 && Math.abs(diff) < eps && a._residualNorm != null && b._residualNorm != null) {
|
|
@@ -321,8 +350,8 @@ export class SupabaseStorage {
|
|
|
321
350
|
}
|
|
322
351
|
return diff;
|
|
323
352
|
});
|
|
324
|
-
debugLog(`[SupabaseStorage] Tier-2 TurboQuant fallback: scored ${rows.length} entries
|
|
325
|
-
|
|
353
|
+
debugLog(`[SupabaseStorage] Tier-2 TurboQuant fallback: scored ${rows.length} entries ` +
|
|
354
|
+
`(${ledgerRows.length} ledger + ${handoffRows.length} handoff), ${scored.length} above threshold`);
|
|
326
355
|
const results = scored.slice(0, params.limit);
|
|
327
356
|
// Strip internal tiebreaker field before returning
|
|
328
357
|
for (const r of results)
|
package/dist/tools/ledgerHandlers.js
CHANGED
|
@@ -400,6 +400,40 @@ export async function sessionSaveHandoffHandler(args, server) {
|
|
|
400
400
|
};
|
|
401
401
|
storage.saveHistorySnapshot(snapshotEntry).catch(err => console.error(`[session_save_handoff] History snapshot failed (non-fatal): ${err instanceof Error ? err.message : String(err)}`));
|
|
402
402
|
}
|
|
403
|
+
// ─── Fire-and-forget embedding generation (enables semantic search on handoffs) ───
|
|
404
|
+
if (data.status === "created" || data.status === "updated") {
|
|
405
|
+
const embeddingText = [
|
|
406
|
+
last_summary || "",
|
|
407
|
+
key_context || "",
|
|
408
|
+
...(open_todos || []),
|
|
409
|
+
].filter(Boolean).join("\n");
|
|
410
|
+
if (embeddingText.trim()) {
|
|
411
|
+
getLLMProvider().generateEmbedding(embeddingText)
|
|
412
|
+
.then(async (embedding) => {
|
|
413
|
+
const patchData = {
|
|
414
|
+
embedding: JSON.stringify(embedding),
|
|
415
|
+
};
|
|
416
|
+
try {
|
|
417
|
+
const { getDefaultCompressor, serialize } = await import("../utils/turboquant.js");
|
|
418
|
+
const compressor = getDefaultCompressor();
|
|
419
|
+
const compressed = compressor.compress(embedding);
|
|
420
|
+
const buf = serialize(compressed);
|
|
421
|
+
patchData.embedding_compressed = buf.toString("base64");
|
|
422
|
+
patchData.embedding_format = `turbo${compressor.bits}`;
|
|
423
|
+
patchData.embedding_turbo_radius = compressed.radius;
|
|
424
|
+
debugLog(`[session_save_handoff] TurboQuant compressed: ${buf.length} bytes`);
|
|
425
|
+
}
|
|
426
|
+
catch (turboErr) {
|
|
427
|
+
console.error(`[session_save_handoff] TurboQuant compression failed (non-fatal): ${turboErr.message}`);
|
|
428
|
+
}
|
|
429
|
+
await storage.patchHandoff(project, PRISM_USER_ID, patchData);
|
|
430
|
+
debugLog(`[session_save_handoff] Embedding saved for project "${project}"`);
|
|
431
|
+
})
|
|
432
|
+
.catch((err) => {
|
|
433
|
+
console.error(`[session_save_handoff] Embedding generation failed (non-fatal): ${err instanceof Error ? err.message : String(err)}`);
|
|
434
|
+
});
|
|
435
|
+
}
|
|
436
|
+
}
|
|
403
437
|
// ─── Trigger resource subscription notification ───
|
|
404
438
|
if (server && (data.status === "created" || data.status === "updated")) {
|
|
405
439
|
try {
|
|
@@ -523,6 +557,7 @@ export async function sessionSaveHandoffHandler(args, server) {
|
|
|
523
557
|
(last_summary ? `Last summary: ${last_summary}\n` : "") +
|
|
524
558
|
(open_todos?.length ? `Open TODOs: ${open_todos.length} items\n` : "") +
|
|
525
559
|
(active_branch ? `Active branch: ${active_branch}\n` : "") +
|
|
560
|
+
`š Embedding generation queued for semantic search.\n` +
|
|
526
561
|
`\nš Remember: pass expected_version: ${newVersion} on your next save ` +
|
|
527
562
|
`to maintain concurrency control.`;
|
|
528
563
|
return {
|
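The handler hunks above start embedding generation without awaiting it and attach a `.catch` so a failure is logged rather than surfaced to the caller. The sketch below shows that fire-and-forget shape in isolation. `buildEmbeddingText` and `queueEmbedding` are hypothetical names, and `embed`, `persist`, and `onError` are injected stand-ins for the provider, storage, and logger used in the diff.

```javascript
// Sketch of the fire-and-forget pattern. Names are hypothetical; embed,
// persist and onError are injected so the flow can run without a provider.
function buildEmbeddingText({ last_summary, key_context, open_todos }) {
  // Same assembly as the hunk: drop empty parts, join with newlines.
  return [last_summary || "", key_context || "", ...(open_todos || [])]
    .filter(Boolean)
    .join("\n");
}

function queueEmbedding(text, embed, persist, onError = console.error) {
  if (!text.trim()) return null; // nothing worth embedding
  // The promise is returned only so tests can observe completion;
  // a fire-and-forget caller simply ignores it.
  return embed(text)
    .then((embedding) => persist({ embedding: JSON.stringify(embedding) }))
    .catch((err) => onError(err instanceof Error ? err.message : String(err)));
}
```

Because the chain always ends in `.catch`, a rejected provider call cannot become an unhandled rejection, which is what makes skipping the `await` safe here.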
package/dist/utils/analytics.js
CHANGED
|
@@ -33,7 +33,7 @@ function estimateTokens(text) {
|
|
|
33
33
|
* Call this from server.ts after each tool handler completes.
|
|
34
34
|
* Uses a write buffer to avoid per-call SQLite overhead.
|
|
35
35
|
*/
|
|
36
|
-
|
|
36
|
+
function _unused_recordInvocation(tool, project, args, response, durationMs, success, errorMessage) {
|
|
37
37
|
const invocation = {
|
|
38
38
|
id: `${Date.now()}-${Math.random().toString(36).slice(2, 8)}`,
|
|
39
39
|
tool,
|
package/dist/utils/llm/adapters/gemini.js
CHANGED
|
@@ -77,17 +77,67 @@ export class GeminiAdapter {
|
|
|
77
77
|
return result.response.text();
|
|
78
78
|
}
|
|
79
79
|
// ─── Embedding Generation ────────────────────────────────────────────────
|
|
80
|
+
static _embeddingCache = new Map();
|
|
81
|
+
static _inflight = new Map();
|
|
82
|
+
static EMBED_CACHE_MAX = 256;
|
|
83
|
+
static EMBED_CACHE_TTL_MS = 5 * 60 * 1000;
|
|
84
|
+
getCachedEmbedding(key) {
|
|
85
|
+
const entry = GeminiAdapter._embeddingCache.get(key);
|
|
86
|
+
if (!entry)
|
|
87
|
+
return null;
|
|
88
|
+
if (Date.now() - entry.ts > GeminiAdapter.EMBED_CACHE_TTL_MS) {
|
|
89
|
+
GeminiAdapter._embeddingCache.delete(key);
|
|
90
|
+
return null;
|
|
91
|
+
}
|
|
92
|
+
// Move to tail for LRU on read
|
|
93
|
+
GeminiAdapter._embeddingCache.delete(key);
|
|
94
|
+
GeminiAdapter._embeddingCache.set(key, entry);
|
|
95
|
+
return entry.embedding;
|
|
96
|
+
}
|
|
97
|
+
setCachedEmbedding(key, embedding) {
|
|
98
|
+
// Delete-then-set moves the key to tail for correct LRU eviction
|
|
99
|
+
GeminiAdapter._embeddingCache.delete(key);
|
|
100
|
+
if (GeminiAdapter._embeddingCache.size >= GeminiAdapter.EMBED_CACHE_MAX) {
|
|
101
|
+
const oldest = GeminiAdapter._embeddingCache.keys().next().value;
|
|
102
|
+
if (oldest !== undefined)
|
|
103
|
+
GeminiAdapter._embeddingCache.delete(oldest);
|
|
104
|
+
}
|
|
105
|
+
GeminiAdapter._embeddingCache.set(key, { embedding, ts: Date.now() });
|
|
106
|
+
}
|
|
80
107
|
async generateEmbedding(text) {
|
|
81
108
|
// Guard: empty string would produce a useless/degenerate embedding.
|
|
82
109
|
// Better to fail loudly here than store a zero-vector in the DB.
|
|
83
110
|
if (!text || !text.trim()) {
|
|
84
111
|
throw new Error("Cannot generate embedding for empty text.");
|
|
85
112
|
}
|
|
113
|
+
const trimmedText = text.trim();
|
|
114
|
+
const cacheKey = `${trimmedText.substring(0, 500)}|L${trimmedText.length}`;
|
|
115
|
+
const cached = this.getCachedEmbedding(cacheKey);
|
|
116
|
+
if (cached) {
|
|
117
|
+
debugLog(`[GeminiAdapter] Embedding cache HIT`);
|
|
118
|
+
return cached;
|
|
119
|
+
}
|
|
120
|
+
// In-flight dedup: if another call is already generating this embedding, await it
|
|
121
|
+
const inflight = GeminiAdapter._inflight.get(cacheKey);
|
|
122
|
+
if (inflight) {
|
|
123
|
+
debugLog(`[GeminiAdapter] Embedding in-flight dedup HIT`);
|
|
124
|
+
return inflight;
|
|
125
|
+
}
|
|
126
|
+
const promise = this._generateEmbeddingImpl(trimmedText, cacheKey);
|
|
127
|
+
GeminiAdapter._inflight.set(cacheKey, promise);
|
|
128
|
+
try {
|
|
129
|
+
return await promise;
|
|
130
|
+
}
|
|
131
|
+
finally {
|
|
132
|
+
GeminiAdapter._inflight.delete(cacheKey);
|
|
133
|
+
}
|
|
134
|
+
}
|
|
135
|
+
async _generateEmbeddingImpl(inputTextRaw, cacheKey) {
|
|
86
136
|
// ── Truncation Guard ───────────────────────────────────────────────────
|
|
87
137
|
// gemini-embedding-001 has a ~2048 token context window.
|
|
88
138
|
// Long session summaries (esp. code-heavy ones) can easily exceed this.
|
|
89
139
|
// We truncate proactively rather than let the API return a 400 error.
|
|
90
|
-
let inputText =
|
|
140
|
+
let inputText = inputTextRaw;
|
|
91
141
|
if (inputText.length > MAX_EMBEDDING_CHARS) {
|
|
92
142
|
debugLog(`[GeminiAdapter] Embedding input truncated from ${inputText.length}` +
|
|
93
143
|
` to ~${MAX_EMBEDDING_CHARS} chars (word-safe)`);
|
|
@@ -130,6 +180,7 @@ export class GeminiAdapter {
|
|
|
130
180
|
throw new Error(`Embedding dimension mismatch: expected ${EMBEDDING_DIMS},` +
|
|
131
181
|
` got ${values?.length ?? "unknown"}`);
|
|
132
182
|
}
|
|
183
|
+
this.setCachedEmbedding(cacheKey, values);
|
|
133
184
|
return values;
|
|
134
185
|
}
|
|
135
186
|
// ─── Image Description (VLM) ─────────────────────────────────────────────
|
package/dist/utils/llm/adapters/openai.js
CHANGED
|
@@ -102,18 +102,47 @@ export class OpenAIAdapter {
|
|
|
102
102
|
return response.choices[0]?.message?.content ?? "";
|
|
103
103
|
}
|
|
104
104
|
// ─── Embedding Generation ────────────────────────────────────────────────
|
|
105
|
+
static _embeddingCache = new Map();
|
|
106
|
+
static _inflight = new Map();
|
|
107
|
+
static EMBED_CACHE_MAX = 256;
|
|
108
|
+
static EMBED_CACHE_TTL_MS = 5 * 60 * 1000;
|
|
105
109
|
async generateEmbedding(text) {
|
|
106
110
|
// Guard: empty input produces a degenerate embedding; fail loudly.
|
|
107
111
|
if (!text || !text.trim()) {
|
|
108
112
|
throw new Error("Cannot generate embedding for empty text.");
|
|
109
113
|
}
|
|
110
|
-
|
|
114
|
+
const trimmedText = text.trim();
|
|
111
115
|
const model = getSettingSync("openai_embedding_model", "text-embedding-3-small");
|
|
116
|
+
const cacheKey = `${model}|${trimmedText.substring(0, 500)}|L${trimmedText.length}`;
|
|
117
|
+
const entry = OpenAIAdapter._embeddingCache.get(cacheKey);
|
|
118
|
+
if (entry && Date.now() - entry.ts < OpenAIAdapter.EMBED_CACHE_TTL_MS) {
|
|
119
|
+
debugLog(`[OpenAIAdapter] Embedding cache HIT`);
|
|
120
|
+
// Move to tail for LRU on read
|
|
121
|
+
OpenAIAdapter._embeddingCache.delete(cacheKey);
|
|
122
|
+
OpenAIAdapter._embeddingCache.set(cacheKey, entry);
|
|
123
|
+
return entry.embedding;
|
|
124
|
+
}
|
|
125
|
+
// In-flight dedup
|
|
126
|
+
const inflight = OpenAIAdapter._inflight.get(cacheKey);
|
|
127
|
+
if (inflight) {
|
|
128
|
+
debugLog(`[OpenAIAdapter] Embedding in-flight dedup HIT`);
|
|
129
|
+
return inflight;
|
|
130
|
+
}
|
|
131
|
+
const promise = this._generateEmbeddingImpl(trimmedText, cacheKey, model);
|
|
132
|
+
OpenAIAdapter._inflight.set(cacheKey, promise);
|
|
133
|
+
try {
|
|
134
|
+
return await promise;
|
|
135
|
+
}
|
|
136
|
+
finally {
|
|
137
|
+
OpenAIAdapter._inflight.delete(cacheKey);
|
|
138
|
+
}
|
|
139
|
+
}
|
|
140
|
+
async _generateEmbeddingImpl(inputTextRaw, cacheKey, model) {
|
|
112
141
|
// ── Truncation Guard ───────────────────────────────────────────────────
|
|
113
142
|
// text-embedding-3-small accepts up to 8191 tokens.
|
|
114
143
|
// We apply the same preventive truncation as GeminiAdapter so behavior
|
|
115
144
|
// is consistent regardless of which provider is active.
|
|
116
|
-
let inputText =
|
|
145
|
+
let inputText = inputTextRaw;
|
|
117
146
|
if (inputText.length > MAX_EMBEDDING_CHARS) {
|
|
118
147
|
debugLog(`[OpenAIAdapter] Embedding input truncated from ${inputText.length}` +
|
|
119
148
|
` to ~${MAX_EMBEDDING_CHARS} chars (word-safe)`);
|
|
@@ -148,6 +177,13 @@ export class OpenAIAdapter {
|
|
|
148
177
|
`If using a local model, use one that natively outputs ${EMBEDDING_DIMS} dims ` +
|
|
149
178
|
`(e.g. nomic-embed-text) or supports the Matryoshka 'dimensions' parameter.`);
|
|
150
179
|
}
|
|
180
|
+
OpenAIAdapter._embeddingCache.delete(cacheKey);
|
|
181
|
+
if (OpenAIAdapter._embeddingCache.size >= OpenAIAdapter.EMBED_CACHE_MAX) {
|
|
182
|
+
const oldest = OpenAIAdapter._embeddingCache.keys().next().value;
|
|
183
|
+
if (oldest !== undefined)
|
|
184
|
+
OpenAIAdapter._embeddingCache.delete(oldest);
|
|
185
|
+
}
|
|
186
|
+
OpenAIAdapter._embeddingCache.set(cacheKey, { embedding, ts: Date.now() });
|
|
151
187
|
return embedding;
|
|
152
188
|
}
|
|
153
189
|
// ─── Image Description (VLM) ─────────────────────────────────────────────
|
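Both adapter hunks add the same three mechanisms around `generateEmbedding`: a `Map` used as an LRU (delete-then-set moves a key to the tail), a TTL check on read, and an in-flight map so concurrent callers share one pending request. The sketch below combines them in one small class. `EmbeddingCache`, `getOrCompute`, and the injectable `now` clock are hypothetical names for illustration; the defaults of 256 entries and 5 minutes are the constants from the diff.

```javascript
// Sketch of an LRU + TTL cache with in-flight de-duplication. Class and
// method names are hypothetical; defaults match the diff's constants.
class EmbeddingCache {
  constructor({ max = 256, ttlMs = 5 * 60 * 1000, now = () => Date.now() } = {}) {
    this.max = max;
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();  // Map insertion order doubles as LRU order
    this.inflight = new Map(); // key -> pending promise
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (this.now() - entry.ts > this.ttlMs) {
      this.entries.delete(key); // expired
      return null;
    }
    // Re-insert so the key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key, value) {
    this.entries.delete(key);
    if (this.entries.size >= this.max) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, ts: this.now() });
  }

  async getOrCompute(key, compute) {
    const cached = this.get(key);
    if (cached !== null) return cached;
    const pending = this.inflight.get(key);
    if (pending) return pending; // share the request already in progress
    const promise = (async () => {
      const value = await compute();
      this.set(key, value);
      return value;
    })();
    this.inflight.set(key, promise);
    try {
      return await promise;
    } finally {
      this.inflight.delete(key);
    }
  }
}
```

One property of the diff's cache key worth keeping in mind: it is built from the first 500 characters plus the total length, so two different texts that share that prefix and length would map to the same entry. A hash of the full text would avoid that at the cost of hashing every input.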
package/dist/utils/localLlm.js
CHANGED
|
@@ -201,7 +201,7 @@ export async function callLocalLlm(userPrompt, model = PRISM_LOCAL_LLM_MODEL, sy
|
|
|
201
201
|
*
|
|
202
202
|
* @returns true if Ollama responds to /api/tags within 3 seconds.
|
|
203
203
|
*/
|
|
204
|
-
|
|
204
|
+
async function _unused_isLocalLlmAvailable() {
|
|
205
205
|
if (!PRISM_LOCAL_LLM_ENABLED)
|
|
206
206
|
return false;
|
|
207
207
|
try {
|
package/dist/utils/universalImporter.js
CHANGED
|
@@ -36,6 +36,7 @@
|
|
|
36
36
|
* For ambiguous files, --format= is mandatory.
|
|
37
37
|
* ──────────────────────────────────────────────────────────────────
|
|
38
38
|
*/
|
|
39
|
+
import { debugLog } from "./logger.js";
|
|
39
40
|
import { getStorage } from "../storage/index.js";
|
|
40
41
|
import { claudeAdapter } from "./migration/claudeAdapter.js";
|
|
41
42
|
import { geminiAdapter } from "./migration/geminiAdapter.js";
|
|
@@ -128,16 +129,16 @@ export async function universalImporter(options) {
|
|
|
128
129
|
if (sniffed) {
|
|
129
130
|
adapter = adapters.find((a) => a.id === sniffed);
|
|
130
131
|
if (adapter) {
|
|
131
|
-
|
|
132
|
+
debugLog(`š Auto-detected format: ${sniffed} (via content sniffing)`);
|
|
132
133
|
}
|
|
133
134
|
}
|
|
134
135
|
}
|
|
135
136
|
if (!adapter) {
|
|
136
137
|
throw new Error(`Could not determine adapter for file: ${filePathArg}. Use --format to specify.`);
|
|
137
138
|
}
|
|
138
|
-
|
|
139
|
+
debugLog(`š Starting migration from ${adapter.id} to Prism...`);
|
|
139
140
|
if (dryRun)
|
|
140
|
-
|
|
141
|
+
debugLog("⚠️ DRY RUN MODE - storage writes disabled.");
|
|
141
142
|
// ── Storage + Concurrency ──────────────────────────────────────
|
|
142
143
|
const storage = await getStorage();
|
|
143
144
|
const limit = pLimit(5);
|
|
@@ -169,7 +170,7 @@ export async function universalImporter(options) {
|
|
|
169
170
|
conversationCount++;
|
|
170
171
|
if (verbose) {
|
|
171
172
|
const turnCount = turns.length;
|
|
172
|
-
|
|
173
|
+
debugLog(`š¦ Conversation #${conversationCount}: ${turnCount} turns (${sessionDate}) ā ${conversationId}`);
|
|
173
174
|
}
|
|
174
175
|
if (dryRun) {
|
|
175
176
|
successCount += turns.length;
|
|
@@ -188,7 +189,7 @@ export async function universalImporter(options) {
|
|
|
188
189
|
if (existing.length > 0) {
|
|
189
190
|
skipCount += turns.length;
|
|
190
191
|
if (verbose) {
|
|
191
|
-
|
|
192
|
+
debugLog(`āļø Skipping duplicate: ${conversationId}`);
|
|
192
193
|
}
|
|
193
194
|
return;
|
|
194
195
|
}
|
|
@@ -229,13 +230,13 @@ export async function universalImporter(options) {
|
|
|
229
230
|
// ── Final Flush ──────────────────────────────────────────────
|
|
230
231
|
// Flush the last conversation (no trailing time gap to trigger it)
|
|
231
232
|
await flushConversation();
|
|
232
|
-
|
|
233
|
-
|
|
234
|
-
|
|
233
|
+
debugLog("\n✅ Migration complete!");
|
|
234
|
+
debugLog(` Conversations: ${conversationCount}`);
|
|
235
|
+
debugLog(` Turns processed: ${successCount}`);
|
|
235
236
|
if (skipCount > 0)
|
|
236
|
-
|
|
237
|
+
debugLog(` Skipped (dup): ${skipCount}`);
|
|
237
238
|
if (failCount > 0)
|
|
238
|
-
|
|
239
|
+
debugLog(` Failed: ${failCount}`);
|
|
239
240
|
return { successCount, failCount, skipCount, conversationCount };
|
|
240
241
|
}
|
|
241
242
|
catch (err) {
|
|
@@ -261,7 +262,7 @@ async function runCLI() {
|
|
|
261
262
|
const dryRun = args.includes("--dry-run") || args.includes("-d");
|
|
262
263
|
const verbose = args.includes("--verbose") || args.includes("-v");
|
|
263
264
|
if (!filePathArg) {
|
|
264
|
-
|
|
265
|
+
debugLog(`
|
|
265
266
|
Prism Universal History Importer
|
|
266
267
|
Usage: node universalImporter.js <file> [options]
|
|
267
268
|
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "prism-mcp-server",
|
|
3
|
-
"version": "
|
|
3
|
+
"version": "16.1.0",
|
|
4
4
|
"mcpName": "io.github.dcostenco/prism-coder",
|
|
5
5
|
"description": "Prism Coder ā Cognitive memory + tool-calling intelligence for AI agents. Mind Palace persistent memory (BFCL Gold Certified, 100% Tool-Call Accuracy, 54 Agent Skills, Zero-Search HDC/HRR retrieval, HIPAA-hardened local-first storage, SLERP-optimized GRPO alignment) plus the prism-coder:7b / 14b open-weights LLM fleet.",
|
|
6
6
|
"module": "index.ts",
|