npm - prism-mcp-server - Versions diffs - 7.3.1 → 7.3.3 - Mend

prism-mcp-server 7.3.1 → 7.3.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (17)

package/README.md +20 -19
package/dist/cli.js +50 -0
package/dist/darkfactory/runner.js +101 -2
package/dist/dashboard/ui.js +2617 -2051
package/dist/dashboard/ui.tmp.js +3475 -0
package/dist/errors.js +29 -0
package/dist/storage/sqlite.js +155 -0
package/dist/storage/supabase.js +116 -0
package/dist/tools/routerExperience.js +14 -0
package/dist/verification/clawValidator.js +2 -1
package/dist/verification/cliHandler.js +325 -0
package/dist/verification/gatekeeper.js +39 -0
package/dist/verification/renameDetector.js +170 -0
package/dist/verification/runner.js +27 -5
package/dist/verification/schema.js +18 -0
package/dist/verification/severityPolicy.js +5 -1
package/package.json +4 -1

package/README.md CHANGED

@@ -29,7 +29,8 @@ Works with **Claude Desktop · Claude Code · Cursor · Windsurf · Cline · Gem
 - [Use Cases](#-use-cases)
 - [What's New](#-whats-new)
   - [v7.3.1 Dark Factory (Fail-Closed Execution)](#v731--dark-factory-fail-closed-execution-)
-  - [v7.2.0 The "Executive Function" Update (Planned)](#v720--the-executive-function-update-)
+  - [v7.2.0 Verification Harness (Planned)](#v720--verification-harness-front-loaded-testing-)
+  - [v7.4.0 Adversarial Dev Harness (Planned)](#v740--adversarial-dev-harness-anti-sycophancy-)
 - [How Prism Compares](#-how-prism-compares)
 - [Tool Reference](#-tool-reference)
 - [Environment Variables](#environment-variables)
@@ -412,7 +413,7 @@ Soft/hard delete (Art. 17), full export in JSON, Markdown, or Obsidian vault `.z
 **Consulting / multi-project** — Switch between client projects with progressive loading: `quick` (~50 tokens), `standard` (~200), or `deep` (~1000+).
-**Complex refactoring (v7.2 planned)** — Prism’s roadmap adds plan-first execution for multi-step changes with persistent plan-state tracking across sessions.
+**Complex refactoring (v7.2 planned)** — Prism’s roadmap adds verification-first execution for multi-step changes with contract-frozen assertions and gated finalization.
 **Team onboarding** — New team member's agent loads the full project history instantly.
@@ -503,24 +504,24 @@ Prism v7.3.1 implements exactly this: a **3-gate fail-closed pipeline** where ev
 </details>
-### v7.2.0 — The "Executive Function" Update 🔭
-> **Planned roadmap release.** Extends Prism from persistent memory toward autonomous plan execution.
+### v7.2.0 — Verification Harness (Front-Loaded Testing) 🔭
+> **Planned roadmap release.** Extends Prism from passive validation to contract-frozen, machine-verifiable execution gates.
-- 🗺️ **Autonomous Plan Decomposition (planned)** — Proposed `session_plan_decompose` tool to transform ambiguous multi-step goals into a structured task DAG.
-- 🔄 **Self-Healing Execution Loop (planned)** — Proposed plan-state engine to capture failed steps, suggest corrective actions, and re-queue recoverable sub-tasks before escalation.
-- 📉 **DAG Plan Visualizer (planned)** — Proposed dashboard Plan/Goal Monitor to render step progress, dependency state, and execution pivots in real time.
-- 🧠 **Context-Aware Goal Tracking (planned)** — Proposed active-plan injection during context loading so agents track not only prior work but current plan position.
-- ⚙️ **Recursive Tool Chaining (planned)** — Proposed middleware path for lower-latency plan-step updates across complex workflows.
-- 🧪 **Plan Integrity Tests (planned)** — Proposed suite validating plan-state persistence across interruptions and session handoffs.
+- 📋 **Spec-Freeze Contract (planned)** — v7.2 formalizes three artifacts with strict responsibilities: `implementation_plan.md` (**how**), `verification_harness.json` (**proof contract**), and `validation_result` (**immutable outcome record**).
+- 🔐 **Rubric Hash Lock (planned)** — `verification_harness.json` is generated before execution and hash-locked (`rubric_hash`) so criteria cannot drift mid-sprint.
+- 🔬 **Multi-Layer Verification (planned)** — Structured checks across **Data Accuracy**, **Agent Behavior**, and **Pipeline Integrity** using machine-parseable assertions.
+- 🤖 **Adversarial Validation Loop (planned)** — A second validation pass evaluates execution outputs against the frozen contract before progression.
+- 🚦 **Finalization Gates (planned)** — Gate policies (`warn` / `gate` / `abort`) evaluate `validation_result` against the frozen rubric before pipeline completion.
+- 🧠 **Routing Feedback Signals (planned)** — Router learning ingests raw verification signals (`pass_rate`, `critical_failures`, `coverage_score`, `rubric_hash`) for downstream confidence adjustment.
 <details>
 <summary><strong>🔬 Concept Example: Before vs. After v7.2</strong></summary>
 **Scenario:** "Refactor the Auth module and update the unit tests."
-**Before (linear prompting):** The agent executes in sequence but can lose place after errors unless the host prompt restates state.
+**Before:** Criteria emerge during or after coding; verification is inconsistent and hard to audit.
-**After (executive planning):** Agent decomposes to a DAG, executes per-step, recovers from failures via plan-state retries, and resumes from the correct dependency node.
+**After (verification-first):** Plan emits a frozen verification contract first, execution runs, validator emits immutable `validation_result`, and finalization gates enforce rubric compliance.
 </details>
@@ -815,13 +816,13 @@ Requires `PRISM_DARK_FACTORY_ENABLED=true`.
 </details>
 <details>
-<summary><strong>Executive Planning (Planned for v7.2)</strong></summary>
+<summary><strong>Verification Harness (Planned for v7.2)</strong></summary>
 | Tool | Purpose |
 |------|---------|
-| `session_plan_decompose` | Decompose natural language goals into a structured DAG of tasks |
-| `session_plan_step_update` | Atomically update the status/result of a specific sub-task |
-| `session_plan_get_active` | Retrieve the current execution DAG and task statuses |
+| `session_plan_decompose` | Decompose natural language goals into an execution plan that references verification requirements |
+| `session_plan_step_update` | Atomically update step status/result with verification context |
+| `session_plan_get_active` | Retrieve active plan state and current verification gating position |
 </details>
@@ -970,7 +971,7 @@ Prism is evolving from smart session logging toward a **cognitive memory archite
 | **v7.0** | Composite Retrieval Scoring — `0.7 × similarity + 0.3 × σ(activation)`; configurable via `PRISM_ACTR_WEIGHT_*` | Hybrid cognitive-neural retrieval models | ✅ Shipped |
 | **v7.0** | AccessLogBuffer — in-memory batch-write buffer with 5s flush; prevents SQLite `SQLITE_BUSY` under parallel agents | Production reliability engineering | ✅ Shipped |
 | **v7.3** | Dark Factory — 3-gate fail-closed EXECUTE pipeline (parse → type → scope) with structured JSON action contract | Industrial safety systems (defense-in-depth, fail-closed valves) | ✅ Shipped |
-| **v7.2** | Executive Planning & DAG tracking | Prefrontal cortex executive control + Directed Acyclic Graph planning | 🔭 Horizon |
+| **v7.2** | Verification-first harness & contract-gated execution | Programmatic verification systems + adversarial validation loops | 🔭 Horizon |
 | **v7.x** | Affect-Tagged Memory — sentiment shapes what gets recalled | Affect-modulated retrieval (neuroscience) | 🔭 Horizon |
 | **v8+** | Zero-Search Retrieval — no index, no ANN, just ask the vector | Holographic Reduced Representations | 🔭 Horizon |
@@ -997,8 +998,8 @@ Shipped. Deterministic task routing (`session_task_route`) with optional experie
 ### v7.0: ACT-R Activation Memory ✅
 Shipped. Scientifically-grounded retrieval re-ranking via ACT-R base-level activation (`B_i = ln(Σ t_j^(-d))`), candidate-scoped spreading activation, parameterized sigmoid normalization, composite scoring, and zero-cold-start access log infrastructure. 49 dedicated unit tests, 705 total passing.
-### v7.2: Executive Function 🔭
-Planned. Adds autonomous plan decomposition, DAG-backed step tracking, and self-healing execution loops for complex multi-step operations.
+### v7.2: Verification Harness 🔭
+Planned. Adds a spec-frozen verification contract (`implementation_plan.md` + `verification_harness.json` + immutable `validation_result`), multi-layer machine checks, and finalization gates before autonomous completion.
 ### Future Tracks
 - **v7.x: Affect-Tagged Memory** — Recall prioritization improves by weighting memories with affective/contextual valence, making surfaced context more behaviorally useful.
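
Note on the routing signals named in the README hunk above (`pass_rate`, `coverage_score`): the runner.js hunk in this same diff derives them from the suite counters. A minimal sketch of that arithmetic follows. The function name `summarizeRun` and its argument shape are hypothetical, chosen for illustration only; the field names `total`, `passed_count`, and `skipped_count` are taken from the runner.js hunk.

```javascript
// Sketch only: `summarizeRun` is a hypothetical helper, not part of the package.
// It mirrors the arithmetic visible in the dist/darkfactory/runner.js hunk.
function summarizeRun({ total, passed_count, skipped_count }, minPassRate, gateAction) {
    const executed = total - skipped_count;
    // Coverage: share of assertions that actually ran (0 when the suite is empty).
    const coverageScore = total > 0 ? executed / total : 0;
    // Pass rate: share of executed assertions that passed (0 when nothing ran).
    const passRate = executed > 0 ? passed_count / executed : 0;
    // A run passes only if it meets the threshold and the gate did not abort.
    const passed = passRate >= minPassRate && gateAction !== 'abort';
    return { coverageScore, passRate, passed };
}

const summary = summarizeRun({ total: 10, passed_count: 6, skipped_count: 2 }, 0.7, 'warn');
console.log(summary); // { coverageScore: 0.8, passRate: 0.75, passed: true }
```

Because JavaScript evaluates `x >= undefined` as `false`, a harness that omits `min_pass_rate` would never pass under this arithmetic unless a default is applied elsewhere (for example during schema validation, which this part of the diff does not show).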

package/dist/cli.js ADDED

@@ -0,0 +1,50 @@
+#!/usr/bin/env node
+import { Command } from 'commander';
+import { SqliteStorage } from './storage/sqlite.js';
+import { handleVerifyStatus, handleGenerateHarness } from './verification/cliHandler.js';
+import * as path from 'path';
+const program = new Command();
+program
+    .name('prism')
+    .description('Prism Configuration & CLI')
+    .version('7.3.1');
+const verifyCmd = program
+    .command('verify')
+    .description('Manage the verification harness');
+verifyCmd
+    .command('status')
+    .description('Check the current verification state and view config drift')
+    .option('-p, --project <name>', 'Project name', path.basename(process.cwd()))
+    .option('-f, --force', 'Bypass verification failures and drift tracking constraints')
+    .option('-u, --user <id>', 'User ID for tenant isolation', 'default')
+    .option('--json', 'Emit machine-readable JSON output with stable keys')
+    .action(async (options) => {
+    const storage = new SqliteStorage();
+    await storage.initialize('./prism-local.db');
+    // H4 fix: Ensure storage is closed on exit to flush WAL and prevent data loss
+    try {
+        await handleVerifyStatus(storage, options.project, !!options.force, options.user, !!options.json);
+    }
+    finally {
+        await storage.close();
+    }
+});
+verifyCmd
+    .command('generate')
+    .description('Bless the current ./verification_harness.json as the canonical rubric')
+    .option('-p, --project <name>', 'Project name', path.basename(process.cwd()))
+    .option('-f, --force', 'Bypass verification failures and drift tracking constraints')
+    .option('-u, --user <id>', 'User ID for tenant isolation', 'default')
+    .option('--json', 'Emit machine-readable JSON output with stable keys')
+    .action(async (options) => {
+    const storage = new SqliteStorage();
+    await storage.initialize('./prism-local.db');
+    // H4 fix: Ensure storage is closed on exit to flush WAL and prevent data loss
+    try {
+        await handleGenerateHarness(storage, options.project, !!options.force, options.user, !!options.json);
+    }
+    finally {
+        await storage.close();
+    }
+});
+program.parse(process.argv);
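
Note on the cli.js hunk above: both subcommands use the same shape, which is to initialize storage, run the handler inside `try`, and close storage in `finally`, so the close runs even when the handler throws. A minimal sketch of that pattern is below. `FakeStorage` and `withStorage` are hypothetical names introduced here for illustration; the package itself inlines the pattern in each `.action()` callback and uses `SqliteStorage`.

```javascript
// Hypothetical stand-in for SqliteStorage; only the lifecycle methods matter here.
class FakeStorage {
    constructor() { this.closed = false; }
    async initialize(_dbPath) { /* a real store would open the database here */ }
    async close() { this.closed = true; }
}

// Hypothetical helper: opens storage, runs the handler, and always closes storage.
async function withStorage(storage, dbPath, handler) {
    await storage.initialize(dbPath);
    try {
        return await handler(storage);
    } finally {
        // Runs on success and on failure, matching the try/finally in cli.js.
        await storage.close();
    }
}

async function demo() {
    const storage = new FakeStorage();
    try {
        await withStorage(storage, './example.db', async () => {
            throw new Error('handler failed');
        });
    } catch (err) {
        console.log(`${err.message}; closed=${storage.closed}`);
    }
    return storage.closed;
}

const demoDone = demo();
```

The error still propagates to the caller after `close()` completes, so a failing handler is reported rather than swallowed.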

package/dist/darkfactory/runner.js CHANGED

@@ -21,10 +21,15 @@ import { getStorage } from '../storage/index.js';
 import { VALID_ACTION_TYPES } from './schema.js';
 import { SafetyController } from './safetyController.js';
 import { invokeClawAgent } from './clawInvocation.js';
-import { PRISM_DARK_FACTORY_POLL_MS, PRISM_DARK_FACTORY_MAX_RUNTIME_MS, PRISM_USER_ID } from '../config.js';
+import { PRISM_DARK_FACTORY_POLL_MS, PRISM_DARK_FACTORY_MAX_RUNTIME_MS, PRISM_USER_ID, PRISM_VERIFICATION_LAYERS, PRISM_VERIFICATION_DEFAULT_SEVERITY } from '../config.js';
 import { debugLog } from '../utils/logger.js';
 import path from 'path';
 import fs from 'fs';
+import * as crypto from 'crypto';
+import { Gatekeeper } from '../verification/gatekeeper.js';
+import { VerificationRunner } from '../verification/runner.js';
+import { computeRubricHash } from '../verification/schema.js';
+import { VerificationGateError } from '../errors.js';
 /** Interval handle for graceful shutdown */
 let runnerInterval = null;
 /** Tracks whether the runner is currently processing a tick (prevents overlap) */
@@ -482,8 +487,102 @@ async function runnerTick() {
             await emitExperienceEvent(pipeline, 'failure', `Scope violation: ${result.scopeViolation}`);
             return;
         }
-        // Determine next step based on result
         const currentStep = pipeline.current_step;
+        // ── Phase 4: Verification Pipeline Orchestrator ──
+        if (currentStep === 'VERIFY' && spec.workingDirectory) {
+            const harnessPath = path.join(path.resolve(spec.workingDirectory), 'verification_harness.json');
+            if (fs.existsSync(harnessPath)) {
+                try {
+                    const rawHarness = fs.readFileSync(harnessPath, 'utf8');
+                    const harnessData = JSON.parse(rawHarness);
+                    // GAP-5 fix: Persist the harness so CLI drift detection works for DarkFactory runs
+                    const rubricHash = computeRubricHash(harnessData.tests);
+                    const harness = {
+                        ...harnessData,
+                        project: pipeline.project,
+                        conversation_id: `dark-factory-${pipeline.id}`,
+                        created_at: new Date().toISOString(),
+                        rubric_hash: rubricHash,
+                    };
+                    await storage.saveVerificationHarness(harness, pipeline.user_id);
+                    // GAP-2 fix: Build VerificationConfig from env vars so PRISM_VERIFICATION_LAYERS
+                    // and PRISM_VERIFICATION_DEFAULT_SEVERITY are respected in DarkFactory pipelines
+                    const vConfig = {
+                        enabled: true,
+                        layers: PRISM_VERIFICATION_LAYERS,
+                        default_severity: PRISM_VERIFICATION_DEFAULT_SEVERITY,
+                    };
+                    const verificationResult = await VerificationRunner.runSuite(rawHarness, {
+                        harness,
+                        layers: PRISM_VERIFICATION_LAYERS,
+                        config: vConfig,
+                    });
+                    const coverageScore = verificationResult.total > 0 ? (verificationResult.total - verificationResult.skipped_count) / verificationResult.total : 0;
+                    const executedCount = verificationResult.total - verificationResult.skipped_count;
+                    const passRate = executedCount > 0 ? verificationResult.passed_count / executedCount : 0;
+                    // GAP-4 fix: Use proper ValidationResult type instead of `any`
+                    const valResult = {
+                        id: crypto.randomUUID(),
+                        rubric_hash: rubricHash,
+                        project: pipeline.project,
+                        conversation_id: `dark-factory-${pipeline.id}`,
+                        run_at: new Date().toISOString(),
+                        passed: passRate >= harnessData.min_pass_rate && verificationResult.severity_gate.action !== "abort",
+                        pass_rate: passRate,
+                        critical_failures: verificationResult.severity_gate.failed_assertions.length,
+                        coverage_score: coverageScore,
+                        result_json: JSON.stringify(verificationResult),
+                        gate_action: verificationResult.severity_gate.action,
+                        gate_override: false,
+                    };
+                    const { canContinue, validatedResult } = Gatekeeper.executeGate(valResult);
+                    await storage.saveVerificationRun(validatedResult, pipeline.user_id);
+                    // GAP-3 fix: Emit verification experience event for ML routing feedback
+                    try {
+                        const confidenceScore = Math.round(passRate * 100);
+                        await storage.saveLedger({
+                            project: pipeline.project,
+                            conversation_id: `dark-factory-${pipeline.id}`,
+                            user_id: pipeline.user_id,
+                            event_type: 'validation_result',
+                            summary: `[VERIFY] ${verificationResult.passed_count}/${verificationResult.total} passed (gate: ${verificationResult.severity_gate.action})`,
+                            keywords: ['dark-factory', 'verification', pipeline.project],
+                            importance: verificationResult.severity_gate.action === 'abort' ? 2 : 0,
+                            confidence_score: confidenceScore,
+                        });
+                    }
+                    catch { /* experience events are advisory — never block execution */ }
+                    if (!canContinue) {
+                        result.success = false;
+                        result.notes = (result.notes ? result.notes + '\n\n' : '') + `[GATE BLOCKED] Pipeline verification runner failed the security gate.`;
+                    }
+                    else {
+                        result.success = result.success && validatedResult.passed;
+                    }
+                }
+                catch (err) {
+                    if (err instanceof VerificationGateError) {
+                        debugLog(`[DarkFactory] Pipeline ${pipeline.id} ABORTED by Verification Gate.`);
+                        try {
+                            await storage.savePipeline({
+                                ...pipeline,
+                                status: 'FAILED',
+                                error: `[GATE ABORT] ${err.message}`,
+                            });
+                        }
+                        catch { /* Status guard */ }
+                        await emitExperienceEvent(pipeline, 'failure', `[GATE ABORT] ${err.message}`);
+                        return;
+                    }
+                    else {
+                        console.error(`[DarkFactory] Verification harness crash: ${err.message}`);
+                        result.success = false;
+                        result.notes = `[GATE CRASH] Verification suite failed to execute: ${err.message}`;
+                    }
+                }
+            }
+        }
+        // Determine next step based on result
         const nextStep = SafetyController.getNextStep(currentStep, pipeline.iteration, spec, result.success // For VERIFY step: success means tests passed
         );
         if (nextStep === null || currentStep === 'FINALIZE') {