prism-mcp-server 7.2.0 → 7.3.1
- package/README.md +83 -1
- package/dist/config.js +16 -0
- package/dist/darkfactory/clawInvocation.js +77 -0
- package/dist/darkfactory/runner.js +584 -0
- package/dist/darkfactory/safetyController.js +197 -0
- package/dist/darkfactory/schema.js +4 -0
- package/dist/dashboard/server.js +103 -0
- package/dist/dashboard/ui.js +118 -6
- package/dist/hivemindWatchdog.js +197 -4
- package/dist/lifecycle.js +9 -1
- package/dist/server.js +41 -3
- package/dist/storage/sqlite.js +88 -0
- package/dist/storage/supabase.js +79 -3
- package/dist/storage/supabaseMigrations.js +52 -0
- package/dist/tools/index.js +5 -0
- package/dist/tools/pipelineDefinitions.js +131 -0
- package/dist/tools/pipelineHandlers.js +214 -0
- package/dist/tools/sessionMemoryDefinitions.js +5 -3
- package/dist/verification/clawValidator.js +228 -0
- package/dist/verification/runner.js +479 -0
- package/dist/verification/schema.js +46 -0
- package/dist/verification/severityPolicy.js +94 -0
- package/package.json +10 -5
package/README.md
CHANGED
@@ -28,7 +28,8 @@ Works with **Claude Desktop · Claude Code · Cursor · Windsurf · Cline · Gem
 - [What Makes Prism Different](#-what-makes-prism-different)
 - [Use Cases](#-use-cases)
 - [What's New](#-whats-new)
-- [v7.
+- [v7.3.1 Dark Factory (Fail-Closed Execution)](#v731--dark-factory-fail-closed-execution-)
+- [v7.2.0 The "Executive Function" Update (Planned)](#v720--the-executive-function-update-)
 - [How Prism Compares](#-how-prism-compares)
 - [Tool Reference](#-tool-reference)
 - [Environment Variables](#environment-variables)
@@ -440,6 +441,68 @@ Then continue a specific thread with a follow-up message to the selected agent,
 
 ## 🆕 What's New
 
+### v7.3.1 — Dark Factory (Fail-Closed Execution) 🏭
+> **Current stable release.** Hardened autonomous pipeline execution with a structured JSON action contract.
+
+When an AI agent executes code autonomously — no human watching, no approval step — a single hallucinated file path can write outside your project, corrupt sibling repos, or hit system files. This is the "dark factory" problem: **lights-out execution demands machine-enforced safety, not LLM good behavior.**
+
+> *"I started building testing harnesses with programmatic checks in the planning phase across 3 layers. I got this idea when I was doing a complex ETL process across 3 databases and I needed to stack 9's on data accuracy, but also across the agent layer. After a considerable amount of hair pulling, I started to front load. It's now part of my lifecycle harness that my dark factory uses by default."*
+> — [Stephen Driggs](https://linkedin.com/in/stephendriggs), VP Product AI at Shift4
+
+Prism v7.3.1 implements exactly this: a **3-gate fail-closed pipeline** where every `EXECUTE` step must pass parse, type, and scope validation before any filesystem side effect occurs.
+
+- 🔒 **Structured Action Contract** — `EXECUTE` steps must return machine-parseable JSON conforming to `{ actions: [{ type, targetPath, content? }] }`. Free-form text is rejected at the gate.
+- 🛡️ **3-Strategy Defensive Parser** — Raw JSON → fenced code block extraction → brace extraction. Handles adversarial LLM output (preamble text, markdown fences, trailing commentary) without ever executing malformed payloads.
+- ✅ **Type Validation** — Only `READ_FILE | WRITE_FILE | PATCH_FILE | RUN_TEST` are permitted. Novel action types invented by the LLM are rejected.
+- 📏 **Scope Validation** — Every `targetPath` is resolved against the pipeline's `workingDirectory` via `SafetyController.validateActionsInScope()`. Path traversal (`../`), sibling-prefix bypasses, and absolute paths outside the boundary are blocked.
+- 🚫 **Pipeline-Level Termination** — A scope violation doesn't just fail the step — it **terminates the entire pipeline** with `status: FAILED` and emits a `failure` experience event for the ML routing layer.
+
+<details>
+<summary><strong>🔬 The 3-Gate Architecture: How a Path Traversal Attack Fails</strong></summary>
+
+**Scenario:** An LLM running autonomously in a Dark Factory pipeline targeting `/home/user/my-app` produces this output for an EXECUTE step:
+
+```json
+{
+"actions": [
+{ "type": "WRITE_FILE", "targetPath": "src/utils.ts", "content": "// valid" },
+{ "type": "WRITE_FILE", "targetPath": "../../.ssh/authorized_keys", "content": "ssh-rsa ATTACK..." }
+]
+}
+```
+
+**Gate 1 — Parse:** ✅ Valid JSON. The 3-strategy parser extracts it cleanly.
+
+**Gate 2 — Type:** ✅ Both actions use `WRITE_FILE`, a valid `ActionType`.
+
+**Gate 3 — Scope:** 🚫 **BLOCKED.** `SafetyController.validateActionsInScope()` resolves `../../.ssh/authorized_keys` to `/home/user/.ssh/authorized_keys` — which is **outside** `/home/user/my-app`. The violation is detected *before* any write occurs.
+
+**Result:** The entire pipeline is terminated immediately. No files are written — not even the valid `src/utils.ts`. A `failure` experience event is emitted so the ML router can learn to avoid this pattern.
+
+```
+Pipeline FAILED: Scope violation — action[1] targetPath
+"../../.ssh/authorized_keys" resolves outside workingDirectory
+```
+
+**Without v7.3.1:** The LLM's raw text output would be interpreted as instructions, and the agent runtime would attempt the write — potentially succeeding depending on filesystem permissions.
+
+**With v7.3.1:** The structured contract makes this class of attack impossible. The LLM never touches the filesystem directly; every action is validated through the 3-gate pipeline first.
+
+</details>
+
+<details>
+<summary><strong>🧪 Edge Cases Covered (67 tests)</strong></summary>
+
+| Category | Examples |
+|----------|----------|
+| **Parse adversarial output** | Prose preamble + JSON, nested fences, empty input, non-string input |
+| **Type coercion** | `"DELETE_FILE"`, `"EXEC_CMD"`, numeric types, null types |
+| **Path traversal** | `../`, `../../`, `/etc/passwd`, null bytes, unicode normalization, embedded newlines |
+| **Shape validation** | Missing `actions` array, non-object actions, empty `targetPath`, root-type coercion |
+| **Stress payloads** | 100-action arrays, 100KB content strings, 500-segment deep paths |
+
+</details>
+
 ### v7.2.0 — The "Executive Function" Update 🔭
 > **Planned roadmap release.** Extends Prism from persistent memory toward autonomous plan execution.
 
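The README text in this hunk describes the first two gates (a three-strategy parser followed by an allow-list of action types), but this part of the diff does not show the runner that implements them. The block below is a minimal sketch of how such gates could look, not the package's actual code: `parseActionPayload`, `validateActionTypes`, `ALLOWED_TYPES`, and `FENCE` are hypothetical names, and the exact extraction rules are assumptions drawn only from the README bullets.

```javascript
// Hypothetical sketch of gates 1 and 2; names are illustrative, not Prism's API.
const ALLOWED_TYPES = new Set(["READ_FILE", "WRITE_FILE", "PATCH_FILE", "RUN_TEST"]);
const FENCE = "`".repeat(3); // markdown code-fence marker, built indirectly
const FENCE_RE = new RegExp(FENCE + "[a-zA-Z]*\\s*\\n([\\s\\S]*?)" + FENCE);

// Gate 1: try each extraction strategy in order; the first candidate that
// parses as JSON wins. Nothing is executed or applied here.
function parseActionPayload(raw) {
  if (typeof raw !== "string" || raw.trim() === "") {
    return { ok: false, error: "empty or non-string output" };
  }
  const text = raw.trim();
  const candidates = [text]; // Strategy 1: the whole output is JSON

  // Strategy 2: body of the first fenced code block (optional language tag)
  const fenced = text.match(FENCE_RE);
  if (fenced) candidates.push(fenced[1]);

  // Strategy 3: everything between the first "{" and the last "}"
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  if (start !== -1 && end > start) candidates.push(text.slice(start, end + 1));

  for (const candidate of candidates) {
    try {
      return { ok: true, value: JSON.parse(candidate) };
    } catch {
      // Not valid JSON; fall through to the next strategy.
    }
  }
  return { ok: false, error: "no parseable JSON found" };
}

// Gate 2: shape and type validation. Anything unexpected is an error.
function validateActionTypes(payload) {
  if (payload === null || typeof payload !== "object" || !Array.isArray(payload.actions)) {
    return { ok: false, error: "missing actions array" };
  }
  for (const [i, action] of payload.actions.entries()) {
    if (action === null || typeof action !== "object") {
      return { ok: false, error: `action[${i}] is not an object` };
    }
    if (!ALLOWED_TYPES.has(action.type)) {
      return { ok: false, error: `action[${i}] has an unsupported type` };
    }
    if (typeof action.targetPath !== "string" || action.targetPath === "") {
      return { ok: false, error: `action[${i}] has an empty targetPath` };
    }
  }
  return { ok: true };
}
```

Every failure path returns `ok: false` instead of a best-effort payload, which is the fail-closed behavior the README emphasizes: a caller would stop at the first gate that does not return `ok: true`.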
@@ -738,6 +801,19 @@ Requires `PRISM_TASK_ROUTER_ENABLED=true` (or dashboard toggle).
 
 </details>
 
+<details>
+<summary><strong>Dark Factory Orchestration (3 tools)</strong></summary>
+
+Requires `PRISM_DARK_FACTORY_ENABLED=true`.
+
+| Tool | Purpose |
+|------|---------|
+| `session_start_pipeline` | Create and enqueue a background autonomous pipeline |
+| `session_check_pipeline_status` | Poll the current step, iteration, and status of a pipeline |
+| `session_abort_pipeline` | Emergency kill switch to halt a running background pipeline |
+
+</details>
+
 <details>
 <summary><strong>Executive Planning (Planned for v7.2)</strong></summary>
 
@@ -797,6 +873,7 @@ Requires `PRISM_TASK_ROUTER_ENABLED=true` (or dashboard toggle).
 | `PRISM_ACTR_WEIGHT_SIMILARITY` | No | Composite score similarity weight (default: `0.7`) |
 | `PRISM_ACTR_WEIGHT_ACTIVATION` | No | Composite score ACT-R activation weight (default: `0.3`) |
 | `PRISM_ACTR_ACCESS_LOG_RETENTION_DAYS` | No | Days before access logs are pruned by background scheduler (default: `90`) |
+| `PRISM_DARK_FACTORY_ENABLED` | No | `"true"` to enable Dark Factory autonomous pipeline tools (`session_start_pipeline`, `session_check_pipeline_status`, `session_abort_pipeline`) |
 
 </details>
 
@@ -827,6 +904,7 @@ Prism is a **stdio-based MCP server** that manages persistent agent memory. Here
 │ ↕ │
 │ ┌────────────────────────────────────────────────────┐ │
 │ │ Background Workers │ │
+│ │ • Dark Factory (3-gate fail-closed pipelines) │ │
 │ │ • Scheduler (TTL, decay, compaction, purge) │ │
 │ │ • Web Scholar (Brave → Firecrawl → LLM → Ledger) │ │
 │ │ • Hivemind heartbeats & Telepathy broadcasts │ │
@@ -891,6 +969,7 @@ Prism is evolving from smart session logging toward a **cognitive memory archite
 | **v7.0** | Candidate-Scoped Spreading Activation — `S_i = Σ(W × strength)` bounded to search result set; prevents God-node dominance | Spreading activation networks (Collins & Loftus, 1975) | ✅ Shipped |
 | **v7.0** | Composite Retrieval Scoring — `0.7 × similarity + 0.3 × σ(activation)`; configurable via `PRISM_ACTR_WEIGHT_*` | Hybrid cognitive-neural retrieval models | ✅ Shipped |
 | **v7.0** | AccessLogBuffer — in-memory batch-write buffer with 5s flush; prevents SQLite `SQLITE_BUSY` under parallel agents | Production reliability engineering | ✅ Shipped |
+| **v7.3** | Dark Factory — 3-gate fail-closed EXECUTE pipeline (parse → type → scope) with structured JSON action contract | Industrial safety systems (defense-in-depth, fail-closed valves) | ✅ Shipped |
 | **v7.2** | Executive Planning & DAG tracking | Prefrontal cortex executive control + Directed Acyclic Graph planning | 🔭 Horizon |
 | **v7.x** | Affect-Tagged Memory — sentiment shapes what gets recalled | Affect-modulated retrieval (neuroscience) | 🔭 Horizon |
 | **v8+** | Zero-Search Retrieval — no index, no ANN, just ask the vector | Holographic Reduced Representations | 🔭 Horizon |
@@ -909,6 +988,9 @@ Shipped in v6.2.0. Edge synthesis, graph pruning with SLO observability, tempora
 ### v6.5: Cognitive Architecture ✅
 Shipped. Full Superposed Memory (SDM) + Hyperdimensional Computing (HDC/VSA) cognitive routing pipeline. Compositional memory states via XOR binding, Hamming resolution, and policy-gated routing (direct / clarify / fallback). 705 tests passing.
 
+### v7.3: Dark Factory — Fail-Closed Execution ✅
+Shipped. Structured JSON action contract for autonomous `EXECUTE` steps. 3-gate validation pipeline (parse → type → scope) terminates pipelines on any violation before filesystem side effects. 67 edge-case tests covering adversarial LLM output, path traversal, and type coercion.
+
 ### v7.1: Prism Task Router ✅
 Shipped. Deterministic task routing (`session_task_route`) with optional experience-based confidence adjustment for host vs. local Claw delegation.
 
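The README changes name the third gate (scope) and say that path traversal, sibling-prefix bypasses, and out-of-boundary absolute paths are all blocked before any write. This part of the diff does not show `safetyController.js`, so the block below is only a sketch of one way such a check could work. `resolvePosix` and `findScopeViolation` are hypothetical names. The resolution is hand-rolled, POSIX-only, and purely lexical so that the example has no imports; it does not follow symlinks or apply Unicode normalization, and real code would more likely build on Node's `path` module.

```javascript
// Hypothetical sketch of gate 3; not the package's SafetyController.

// Resolve targetPath against an absolute baseDir, collapsing "." and "..".
function resolvePosix(baseDir, targetPath) {
  const joined = targetPath.startsWith("/") ? targetPath : `${baseDir}/${targetPath}`;
  const out = [];
  for (const segment of joined.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") out.pop(); // popping an empty stack stays at "/"
    else out.push(segment);
  }
  return "/" + out.join("/");
}

// Return the first violation, or null when every action stays inside
// workingDirectory. Callers are expected to write nothing unless this is null.
function findScopeViolation(actions, workingDirectory) {
  const root = resolvePosix("/", workingDirectory);
  // Compare against root + "/" so "/home/user/my-app-evil" does not pass
  // as a child of "/home/user/my-app".
  const prefix = root === "/" ? "/" : root + "/";
  for (const [index, action] of actions.entries()) {
    const target = action.targetPath;
    if (typeof target !== "string" || target === "" || target.includes("\0")) {
      return { index, reason: "invalid targetPath" };
    }
    const resolved = resolvePosix(root, target);
    if (resolved !== root && !resolved.startsWith(prefix)) {
      return { index, reason: `resolves outside workingDirectory: ${resolved}` };
    }
  }
  return null;
}

// The scenario from the README: one valid write, one traversal attempt.
const violation = findScopeViolation(
  [
    { type: "WRITE_FILE", targetPath: "src/utils.ts" },
    { type: "WRITE_FILE", targetPath: "../../.ssh/authorized_keys" },
  ],
  "/home/user/my-app"
);
console.log(violation);
```

Because the whole action list is checked before anything is applied, the valid `src/utils.ts` write is never performed either, matching the "no files are written" outcome the README describes. Under this lexical resolution, `../../.ssh/authorized_keys` relative to `/home/user/my-app` comes out as `/home/.ssh/authorized_keys`, one directory higher than the path quoted in the README; both are outside the boundary.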
package/dist/config.js
CHANGED
@@ -241,3 +241,19 @@ export const PRISM_TASK_ROUTER_ENABLED_ENV = process.env.PRISM_TASK_ROUTER_ENABL
 export const PRISM_TASK_ROUTER_CONFIDENCE_THRESHOLD = parseFloat(process.env.PRISM_TASK_ROUTER_CONFIDENCE_THRESHOLD || "0.6");
 /** Maximum complexity score (1-10) that Claw can handle. Tasks above this → host. (Default: 4) */
 export const PRISM_TASK_ROUTER_MAX_CLAW_COMPLEXITY = parseInt(process.env.PRISM_TASK_ROUTER_MAX_CLAW_COMPLEXITY || "4", 10);
+// ─── v7.2: Verification Harness ──────────────────────────────
+/** Master switch for the v7.2.0 enhanced verification harness. */
+export const PRISM_VERIFICATION_HARNESS_ENABLED = process.env.PRISM_VERIFICATION_HARNESS_ENABLED === "true";
+/** Comma-separated list of verification layers to run. */
+export const PRISM_VERIFICATION_LAYERS = (process.env.PRISM_VERIFICATION_LAYERS || "data,agent,pipeline").split(",").map(l => l.trim()).filter(Boolean);
+/** Default severity floor for all assertions. Overrides individual assertion severity when higher. */
+export const PRISM_VERIFICATION_DEFAULT_SEVERITY = (process.env.PRISM_VERIFICATION_DEFAULT_SEVERITY || "warn");
+// ─── v7.3: Dark Factory Orchestration ─────────────────────────
+// Autonomous pipeline runner: PLAN → EXECUTE → VERIFY → iterate.
+// Opt-in because it executes LLM calls in the background.
+/** Master switch for the Dark Factory background runner. */
+export const PRISM_DARK_FACTORY_ENABLED = process.env.PRISM_DARK_FACTORY_ENABLED === "true"; // Opt-in
+/** Poll interval for the runner loop (ms). Default: 30s. */
+export const PRISM_DARK_FACTORY_POLL_MS = parseInt(process.env.PRISM_DARK_FACTORY_POLL_MS || "30000", 10);
+/** Default max wall-clock time per pipeline (ms). Default: 15 minutes. */
+export const PRISM_DARK_FACTORY_MAX_RUNTIME_MS = parseInt(process.env.PRISM_DARK_FACTORY_MAX_RUNTIME_MS || "900000", 10);
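The config hunk above reads three kinds of settings from the environment: strict `"true"` flags, integers via `parseInt(value || default, 10)`, and a comma-separated list. One property of the integer pattern is worth seeing concretely: the `||` fallback only covers an unset or empty variable, so a non-numeric value such as `PRISM_DARK_FACTORY_POLL_MS=abc` reaches `parseInt` and yields `NaN`. The sketch below shows the same three parsing styles with a guard for that case. `flagFromEnv`, `positiveIntFromEnv`, and `listFromEnv` are hypothetical helpers that are not part of the package, and the sample `env` object is invented for illustration.

```javascript
// Hypothetical helpers; not part of prism-mcp-server.

// Strict opt-in, mirroring the `=== "true"` comparisons in the hunk:
// "TRUE", "1", and "yes" all leave the feature off.
function flagFromEnv(env, name) {
  return env[name] === "true";
}

// Like parseInt(env[name] || fallback, 10), except that a non-numeric or
// non-positive value falls back to the default instead of propagating NaN.
function positiveIntFromEnv(env, name, fallback) {
  const parsed = Number.parseInt(env[name] ?? "", 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

// Same split / trim / filter chain the hunk uses for PRISM_VERIFICATION_LAYERS.
function listFromEnv(env, name, fallback) {
  return (env[name] || fallback)
    .split(",")
    .map((item) => item.trim())
    .filter(Boolean);
}

// Invented sample environment for illustration only.
const env = {
  PRISM_DARK_FACTORY_ENABLED: "TRUE",
  PRISM_DARK_FACTORY_POLL_MS: "abc",
  PRISM_VERIFICATION_LAYERS: " data, ,agent ",
};

const enabled = flagFromEnv(env, "PRISM_DARK_FACTORY_ENABLED");
const pollMs = positiveIntFromEnv(env, "PRISM_DARK_FACTORY_POLL_MS", 30000);
const layers = listFromEnv(env, "PRISM_VERIFICATION_LAYERS", "data,agent,pipeline");
console.log({ enabled, pollMs, layers });
```

With the sample values, the flag stays off because the comparison is case-sensitive, the poll interval falls back to its default, and the blank list entry is dropped.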
package/dist/darkfactory/clawInvocation.js
ADDED
@@ -0,0 +1,77 @@
+import { getLLMProvider } from '../utils/llm/factory.js';
+import { OpenAIAdapter } from '../utils/llm/adapters/openai.js';
+import { SafetyController } from './safetyController.js';
+import { debugLog } from '../utils/logger.js';
+/**
+ * JSON output schema instruction injected into EXECUTE step prompts.
+ * Forces the LLM to respond with machine-parseable structured output
+ * instead of free-form text. The runner validates this shape before
+ * any actions are applied.
+ */
+const EXECUTE_JSON_SCHEMA = `
+You MUST respond with ONLY a valid JSON object matching this schema:
+{
+"actions": [
+{
+"type": "READ_FILE" | "WRITE_FILE" | "PATCH_FILE" | "RUN_TEST",
+"targetPath": "<relative path within the workspace>",
+"content": "<file content for WRITE_FILE>",
+"patch": "<patch content for PATCH_FILE>",
+"command": "<test command for RUN_TEST>"
+}
+],
+"notes": "<optional summary of what you did>"
+}
+
+RULES:
+- type MUST be one of: READ_FILE, WRITE_FILE, PATCH_FILE, RUN_TEST
+- targetPath MUST be a relative path within the workspace
+- Do NOT include any text outside the JSON object
+- Do NOT use markdown code fences
+- If you cannot complete the task, return: {"actions": [], "notes": "reason"}
+`.trim();
+/**
+ * Invocation wrapper that routes payload specs to the local Claw agent model (Qwen 2.5),
+ * or the active LLM provider as fallback.
+ *
+ * Uses SafetyController.generateBoundaryPrompt() for scope injection
+ * instead of inline prompt construction — single source of truth for safety rules.
+ *
+ * v7.3.1: EXECUTE steps request structured JSON output via EXECUTE_JSON_SCHEMA.
+ * Non-EXECUTE steps continue to use free-form text output.
+ */
+export async function invokeClawAgent(spec, state, timeoutMs = 120000 // 2 min default timeout for internal executions
+) {
+    // BYOM Override: Provide path to use alternative open-source pipelines
+    // (e.g. through the OpenAI structured adapter which also points to local endpoints like Ollama / vLLM if configured)
+    const llm = spec.modelOverride
+        ? new OpenAIAdapter() // Bypasses the factory to route locally
+        : getLLMProvider();
+    // Scope injection via SafetyController — single source of truth
+    const systemPrompt = SafetyController.generateBoundaryPrompt(spec, state);
+    // v7.3.1: EXECUTE steps get structured JSON output instructions
+    const isExecuteStep = state.current_step === 'EXECUTE';
+    const executePrompt = isExecuteStep
+        ? `Based on the system instructions, execute the necessary actions for the current step (${state.current_step}).\n\n${EXECUTE_JSON_SCHEMA}`
+        : `Based on the system instructions, execute the necessary task for the current step (${state.current_step}). Respond with your actions and observations.`;
+    debugLog(`[ClawInvocation] Launching agent on pipeline ${state.id} step=${state.current_step} iter=${state.iteration} with ${timeoutMs}ms limit.${isExecuteStep ? ' (JSON mode)' : ''}`);
+    try {
+        // Timeout Promise to ensure the runner thread does not block indefinitely
+        const timeboundExecution = Promise.race([
+            llm.generateText(executePrompt, systemPrompt),
+            new Promise((_, reject) => setTimeout(() => reject(new Error('LLM_EXECUTION_TIMEOUT')), timeoutMs))
+        ]);
+        const result = await timeboundExecution;
+        return {
+            success: true,
+            resultText: result
+        };
+    }
+    catch (error) {
+        debugLog(`[ClawInvocation] Exception during generation: ${error.message}`);
+        return {
+            success: false,
+            resultText: error.message
+        };
+    }
+}
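`invokeClawAgent` bounds the LLM call by racing it against a `setTimeout` that rejects with `LLM_EXECUTION_TIMEOUT`. In the code as shown, that timer is never cleared, so when the LLM call wins the race the timer remains scheduled until `timeoutMs` elapses (two minutes by default). The following is a minimal sketch of the same race with cleanup, offered as an illustration and not as the package's implementation. `withTimeout` and `invokeWithLimit` are hypothetical names, and `generate` stands in for the `llm.generateText(...)` call.

```javascript
// Hypothetical helpers; not part of prism-mcp-server.

// Race `promise` against a timer, and always clear the timer once the race
// settles so no stray timeout keeps the event loop busy.
function withTimeout(promise, timeoutMs, label = "LLM_EXECUTION_TIMEOUT") {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(label)), timeoutMs);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Mirrors the { success, resultText } return shape used in the hunk.
// `generate` is a stand-in for the real LLM call.
async function invokeWithLimit(generate, timeoutMs) {
  try {
    const resultText = await withTimeout(generate(), timeoutMs);
    return { success: true, resultText };
  } catch (error) {
    return { success: false, resultText: error.message };
  }
}

// A fast call resolves normally; a slow call is cut off by the timer.
const fast = invokeWithLimit(() => Promise.resolve("ok"), 1000);
const slow = invokeWithLimit(
  () => new Promise((resolve) => setTimeout(() => resolve("late"), 200)),
  20
);
Promise.all([fast, slow]).then((results) => console.log(results));
```

Like the original race, this only stops waiting for the call; it does not cancel the underlying request. Cancelling would need support from the provider adapter, which this diff does not show.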