@rigour-labs/core 4.3.5 → 5.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (32)

package/README.md +46 -10
package/dist/gates/base.d.ts +3 -0
package/dist/gates/checkpoint.d.ts +23 -8
package/dist/gates/checkpoint.js +109 -45
package/dist/gates/checkpoint.test.js +6 -3
package/dist/gates/dependency.d.ts +39 -0
package/dist/gates/dependency.js +212 -5
package/dist/gates/duplication-drift.d.ts +101 -6
package/dist/gates/duplication-drift.js +427 -33
package/dist/gates/logic-drift.d.ts +70 -0
package/dist/gates/logic-drift.js +280 -0
package/dist/gates/runner.js +29 -1
package/dist/gates/style-drift.d.ts +53 -0
package/dist/gates/style-drift.js +305 -0
package/dist/index.d.ts +4 -0
package/dist/index.js +4 -0
package/dist/inference/model-manager.js +5 -1
package/dist/inference/types.d.ts +6 -1
package/dist/inference/types.js +6 -1
package/dist/services/adaptive-thresholds.d.ts +54 -10
package/dist/services/adaptive-thresholds.js +161 -35
package/dist/services/adaptive-thresholds.test.js +24 -20
package/dist/services/filesystem-cache.d.ts +50 -0
package/dist/services/filesystem-cache.js +124 -0
package/dist/services/temporal-drift.d.ts +101 -0
package/dist/services/temporal-drift.js +386 -0
package/dist/templates/universal-config.js +17 -0
package/dist/types/index.d.ts +196 -0
package/dist/types/index.js +19 -0
package/dist/utils/scanner.d.ts +6 -1
package/dist/utils/scanner.js +8 -1
package/package.json +6 -6

package/README.md CHANGED

@@ -3,23 +3,46 @@
 [![npm version](https://img.shields.io/npm/v/@rigour-labs/core?color=cyan)](https://www.npmjs.com/package/@rigour-labs/core)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
-**Deterministic quality gate engine for AI-generated code.**
+**AI Agent Governance Engine — deterministic quality gates, drift detection, and LLM-powered deep analysis.**
-The core library powering [Rigour](https://rigour.run) — AST analysis, AI drift detection, security scanning, and Fix Packet generation across TypeScript, JavaScript, Python, Go, Ruby, and C#/.NET.
+The core library powering [Rigour](https://rigour.run) — 27+ quality gates, five-signal deep analysis pipeline, temporal drift engine, and AI agent DLP across TypeScript, JavaScript, Python, Go, Ruby, and C#/.NET.
 > This package is the engine. For the CLI, use [`@rigour-labs/cli`](https://www.npmjs.com/package/@rigour-labs/cli). For MCP integration, use [`@rigour-labs/mcp`](https://www.npmjs.com/package/@rigour-labs/mcp).
 ## What's Inside
-### 23 Quality Gates
+### 27+ Deterministic Quality Gates
 **Structural:** File size, cyclomatic complexity, method count, parameter count, nesting depth, required docs, content hygiene.
-**Security:** Hardcoded secrets, SQL injection, XSS, command injection, path traversal.
+**Security:** Hardcoded secrets, SQL injection, XSS, command injection, path traversal, frontend secret exposure.
-**AI-Native Drift Detection:** Duplication drift, hallucinated imports, inconsistent error handling, context window artifacts, async & error safety (promise safety).
+**AI Drift Detection:**
+- **Three-pass duplication drift** — MD5 exact → AST Jaccard (tree-sitter) → semantic embedding (all-MiniLM-L6-v2, 384D cosine). Catches `.find()` vs `.filter()[0]` — same intent, different implementation.
+- **Hallucinated imports** — language-aware resolution for relative + package imports.
+- **Phantom APIs** — non-existent stdlib/framework methods the LLM invented.
+- **Style drift** — fingerprints naming, error handling, import style, quote preferences against project baseline.
+- **Logic drift** — tracks comparison operators (>= → >), branch counts, return statements per function across scans.
+- **Dependency bloat** — unused deps, heavy alternatives (moment→dayjs), duplicate purpose packages.
+- **Context-window artifacts**, inconsistent error handling, promise safety, deprecated APIs.
-**Agent Governance:** Multi-agent scope isolation, checkpoint supervision, context drift, retry loop breaker.
+**Agent Governance:** Multi-agent scope isolation, EWMA-based checkpoint supervision, context drift, retry loop breaker, memory & skills governance with DLP scanning.
+### Five-Signal Deep Analysis Pipeline
+Rigour's deep analysis is not a wrapper around a generic LLM. The model operates within a cage of deterministic facts:
+1. **Extract** — five independent signal streams (AST facts, semantic embeddings, style fingerprints, logic baselines, dependency graphs) computed deterministically before the LLM sees anything.
+2. **Interpret** — the model receives structured facts (not raw source), focuses on SOLID, design patterns, language idioms, architecture. Constrained input prevents hallucination.
+3. **Verify** — every LLM finding is cross-referenced against all five signal streams. Wrong line numbers, phantom patterns, non-existent functions → discarded. Only verified findings with confidence scores reach the report.
+Both model tiers (lite sidecar + pro code-specialized) are fine-tuned via the [DriftBench RLAIF pipeline](https://github.com/rigour-labs/driftbench) where the five signal streams serve as the teacher signal.
+### Temporal Drift Engine (v5.1)
+Cross-session trend analysis powered by EWMA and Z-score anomaly detection. Tracks three independent provenance streams (AI drift, structural, security) with separate trend directions. Reads from the SQLite brain for month-over-month analysis.
+Key capabilities: per-provenance EWMA streams (alpha=0.3), Z-score anomaly detection (|Z| > 2.0), monthly/weekly rollups, semantic duplicate tracking, style + logic baseline evolution, human-readable narrative generation.
 ### Multi-Language Support
@@ -41,14 +64,27 @@ Machine-readable JSON diagnostics with severity, provenance, file, line number,
 ```typescript
 import { GateRunner } from '@rigour-labs/core';
-const runner = new GateRunner(config, projectRoot);
-const report = await runner.run();
+const runner = new GateRunner(config);
+const report = await runner.run(projectRoot);
-console.log(report.pass);      // true or false
-console.log(report.score);     // 0-100
+console.log(report.status);    // 'PASS' or 'FAIL'
+console.log(report.stats.score);     // 0-100
 console.log(report.failures);  // Failure[]
 ```
+### With Deep Analysis
+```typescript
+import { GateRunner } from '@rigour-labs/core';
+const runner = new GateRunner(config);
+const report = await runner.run(projectRoot, undefined, {
+  enabled: true,
+  pro: false,        // true for full-power model
+  provider: 'local', // or 'claude', 'openai', etc.
+});
+```
 ## Documentation
 **[Full docs at docs.rigour.run](https://docs.rigour.run)**
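
The README text added in this diff mentions Z-score anomaly detection with a `|Z| > 2.0` cutoff, but the temporal-drift implementation is not shown in this section. As a rough illustration of that kind of check, here is a minimal sketch that assumes a population standard deviation over prior scores. `zScore` and `isAnomaly` are hypothetical names, not the package's API.

```typescript
// Hypothetical helpers, not taken from @rigour-labs/core.
// Z-score of `latest` relative to the mean and population standard
// deviation of `history`.
function zScore(history: number[], latest: number): number {
  const n = history.length;
  if (n === 0) return 0;
  const mean = history.reduce((sum, x) => sum + x, 0) / n;
  const variance =
    history.reduce((sum, x) => sum + (x - mean) * (x - mean), 0) / n;
  const std = Math.sqrt(variance);
  // Assumption: a perfectly flat history yields 0 rather than Infinity.
  return std === 0 ? 0 : (latest - mean) / std;
}

// Flags a value whose |Z| exceeds the limit (2.0 is the cutoff named in the README).
function isAnomaly(history: number[], latest: number, limit = 2.0): boolean {
  return Math.abs(zScore(history, latest)) > limit;
}
```

With a history of `[80, 82, 78, 81, 79]` the mean is 80 and the standard deviation is about 1.41, so a new score of 70 is flagged while 81 is not.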

package/dist/gates/base.d.ts CHANGED

@@ -1,10 +1,13 @@
 import { GoldenRecord } from '../services/context-engine.js';
 import { Failure, Severity, Provenance } from '../types/index.js';
+import type { FileSystemCache } from '../services/filesystem-cache.js';
 export interface GateContext {
     cwd: string;
     record?: GoldenRecord;
     ignore?: string[];
     patterns?: string[];
+    /** Shared file cache across gates — reduces memory ~80% on large repos */
+    fileCache?: FileSystemCache;
 }
 export declare abstract class Gate {
     readonly id: string;
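
The new `fileCache` field lets several gates share one copy of each file's contents. The `FileSystemCache` API itself is not shown in this section, so the following is only a sketch of the idea under stated assumptions: `SharedFileCache`, `SketchContext`, the injected loader, and both gate functions are hypothetical stand-ins.

```typescript
// Hypothetical stand-in for a shared cache; the real FileSystemCache
// interface is not visible in this diff section.
type Loader = (path: string) => string;

class SharedFileCache {
  private readonly contents = new Map<string, string>();
  private readonly load: Loader;
  loads = 0; // number of times the underlying loader was actually called

  constructor(load: Loader) {
    this.load = load;
  }

  read(path: string): string {
    const cached = this.contents.get(path);
    if (cached !== undefined) return cached;
    const text = this.load(path);
    this.loads += 1;
    this.contents.set(path, text);
    return text;
  }
}

// Reduced version of GateContext carrying only what the sketch needs.
interface SketchContext {
  cwd: string;
  fileCache?: SharedFileCache;
}

// An in-memory "disk" keeps the example free of filesystem access.
const fakeDisk: Record<string, string> = { 'src/a.ts': 'export const a = 1;\n' };
const cache = new SharedFileCache((p) => fakeDisk[p] ?? '');
const ctx: SketchContext = { cwd: '.', fileCache: cache };

// Two hypothetical gates that read the same file through the shared cache.
function lineCountGate(c: SketchContext, path: string): number {
  return (c.fileCache?.read(path) ?? '').split('\n').length;
}
function sizeGate(c: SketchContext, path: string): number {
  return (c.fileCache?.read(path) ?? '').length;
}

const lines = lineCountGate(ctx, 'src/a.ts');
const size = sizeGate(ctx, 'src/a.ts');
```

Both gates see the same text, and the loader runs once rather than once per gate.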

package/dist/gates/checkpoint.d.ts CHANGED

@@ -1,16 +1,25 @@
 /**
- * Checkpoint Supervision Gate
+ * Checkpoint Supervision Gate (v2)
  *
  * Monitors agent quality during extended execution for frontier models
  * like GPT-5.3-Codex "coworking mode" that run autonomously for long periods.
  *
- * Features:
- * - Time-based checkpoint triggers
- * - Quality score tracking
- * - Drift detection (quality degradation over time)
- * - Auto-save on failure
+ * v2 upgrades:
+ * - EWMA (Exponentially Weighted Moving Average) replaces linear regression
+ * - One bad checkpoint no longer tanks the whole trend
+ * - Configurable smoothing factor (α=0.3 default — recent data weighted more)
+ * - Separate "sudden drop" detection from "gradual decline" detection
  *
- * @since v2.14.0
+ * EWMA formula:
+ *   ewma_t = α × score_t + (1 - α) × ewma_(t-1)
+ *
+ * Why EWMA > Linear Regression:
+ * - Linear regression on 5 points: one outlier shifts the slope dramatically
+ * - EWMA: one bad score dampened by history, persistent drops amplified
+ * - α=0.3: ~70% weight on history, 30% on new data → noise-resistant
+ *
+ * @since v2.14.0 (original, linear regression)
+ * @since v5.0.0  (EWMA drift detection)
  */
 import { Gate, GateContext } from './base.js';
 import { Failure, Provenance } from '../types/index.js';
@@ -22,6 +31,8 @@ export interface CheckpointEntry {
     summary: string;
     qualityScore: number;
     warnings: string[];
+    /** EWMA value at this checkpoint (computed on record) */
+    ewma?: number;
 }
 export interface CheckpointSession {
     sessionId: string;
@@ -36,6 +47,10 @@ export interface CheckpointConfig {
     quality_threshold?: number;
     drift_detection?: boolean;
     auto_save_on_failure?: boolean;
+    /** EWMA smoothing factor. Higher = more weight on recent data. Default 0.3 */
+    ewma_alpha?: number;
+    /** Drop from EWMA that triggers drift warning. Default 15 */
+    drift_drop_threshold?: number;
 }
 /**
  * Get or create checkpoint session
@@ -44,7 +59,7 @@ export declare function getOrCreateCheckpointSession(cwd: string): CheckpointSes
 /**
  * Record a checkpoint with quality evaluation
  */
-export declare function recordCheckpoint(cwd: string, progressPct: number, filesChanged: string[], summary: string, qualityScore: number): {
+export declare function recordCheckpoint(cwd: string, progressPct: number, filesChanged: string[], summary: string, qualityScore: number, config?: CheckpointConfig): {
     continue: boolean;
     warnings: string[];
     checkpoint: CheckpointEntry;
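
The doc comment above summarizes α=0.3 as "~70% weight on history, 30% on new data". Unrolling the stated formula `ewma_t = α × score_t + (1 - α) × ewma_(t-1)` shows how that history weight is spread across older scores. The sketch below assumes the chain is seeded with the first score, as the comment in `checkpoint.js` describes; `ewmaWeights` is a hypothetical helper written for illustration.

```typescript
// Hypothetical helper: the effective weight each score contributes to the
// final EWMA, listed newest first, for a chain of n scores.
function ewmaWeights(alpha: number, n: number): number[] {
  const weights: number[] = [];
  for (let k = 0; k < n; k++) {
    const decay = Math.pow(1 - alpha, k);
    // The oldest score seeds the chain, so it carries no alpha factor.
    weights.push(k === n - 1 ? decay : alpha * decay);
  }
  return weights;
}

const w = ewmaWeights(0.3, 4);
```

For four scores the weights come out to roughly 0.30, 0.21, 0.147, and 0.343 (the seed), which sum to 1.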

package/dist/gates/checkpoint.js CHANGED

@@ -1,16 +1,25 @@
 /**
- * Checkpoint Supervision Gate
+ * Checkpoint Supervision Gate (v2)
  *
  * Monitors agent quality during extended execution for frontier models
  * like GPT-5.3-Codex "coworking mode" that run autonomously for long periods.
  *
- * Features:
- * - Time-based checkpoint triggers
- * - Quality score tracking
- * - Drift detection (quality degradation over time)
- * - Auto-save on failure
+ * v2 upgrades:
+ * - EWMA (Exponentially Weighted Moving Average) replaces linear regression
+ * - One bad checkpoint no longer tanks the whole trend
+ * - Configurable smoothing factor (α=0.3 default — recent data weighted more)
+ * - Separate "sudden drop" detection from "gradual decline" detection
  *
- * @since v2.14.0
+ * EWMA formula:
+ *   ewma_t = α × score_t + (1 - α) × ewma_(t-1)
+ *
+ * Why EWMA > Linear Regression:
+ * - Linear regression on 5 points: one outlier shifts the slope dramatically
+ * - EWMA: one bad score dampened by history, persistent drops amplified
+ * - α=0.3: ~70% weight on history, 30% on new data → noise-resistant
+ *
+ * @since v2.14.0 (original, linear regression)
+ * @since v5.0.0  (EWMA drift detection)
  */
 import { Gate } from './base.js';
 import { Logger } from '../utils/logger.js';
@@ -22,8 +31,78 @@ let currentCheckpointSession = null;
  * Generate unique checkpoint ID
  */
 function generateCheckpointId() {
-    return `cp-${Date.now()}-${Math.random().toString(36).substr(2, 6)}`;
+    return `cp-${Date.now()}-${Math.random().toString(36).substring(2, 8)}`;
+}
+// ─── EWMA Computation ──────────────────────────────────────────────
+/**
+ * Compute EWMA for a new data point.
+ *
+ * ewma_t = α × value + (1 - α) × ewma_(t-1)
+ *
+ * For the first data point, ewma = value (no history to smooth against).
+ */
+function computeEWMA(value, previousEWMA, alpha) {
+    if (previousEWMA === undefined)
+        return value;
+    return alpha * value + (1 - alpha) * previousEWMA;
+}
+/**
+ * Detect quality drift using EWMA.
+ *
+ * Two-signal detection:
+ *
+ * 1. Sudden Drop: Current score is significantly below the EWMA.
+ *    This catches "agent suddenly started producing garbage."
+ *    Threshold: score < ewma - drift_drop_threshold
+ *
+ * 2. Gradual Decline: EWMA itself is trending downward.
+ *    Compare current EWMA to EWMA from N checkpoints ago.
+ *    This catches "agent is slowly getting worse over 30 minutes."
+ *    Threshold: ewma_now < ewma_5ago - drift_drop_threshold
+ *
+ * Returns trend assessment and whether drift alarm should fire.
+ */
+function detectDrift(checkpoints, alpha, dropThreshold) {
+    if (checkpoints.length < 3) {
+        return { hasDrift: false, trend: 'stable', ewma: checkpoints.length > 0 ? checkpoints[checkpoints.length - 1].qualityScore : 0 };
+    }
+    // Recompute EWMA chain to ensure consistency
+    let ewma = checkpoints[0].qualityScore;
+    for (let i = 1; i < checkpoints.length; i++) {
+        ewma = computeEWMA(checkpoints[i].qualityScore, ewma, alpha);
+    }
+    const currentScore = checkpoints[checkpoints.length - 1].qualityScore;
+    // Signal 1: Sudden drop from smoothed average
+    if (currentScore < ewma - dropThreshold) {
+        return {
+            hasDrift: true,
+            trend: 'degrading',
+            ewma,
+            reason: `Sudden quality drop: score ${currentScore}% vs EWMA ${ewma.toFixed(1)}% (gap: ${(ewma - currentScore).toFixed(1)})`,
+        };
+    }
+    // Signal 2: Gradual EWMA decline (compare to EWMA 5 checkpoints ago)
+    if (checkpoints.length >= 6) {
+        let ewmaAt5Ago = checkpoints[0].qualityScore;
+        const stopAt = checkpoints.length - 5;
+        for (let i = 1; i <= stopAt; i++) {
+            ewmaAt5Ago = computeEWMA(checkpoints[i].qualityScore, ewmaAt5Ago, alpha);
+        }
+        if (ewma < ewmaAt5Ago - dropThreshold) {
+            return {
+                hasDrift: true,
+                trend: 'degrading',
+                ewma,
+                reason: `Gradual decline: EWMA dropped from ${ewmaAt5Ago.toFixed(1)}% to ${ewma.toFixed(1)}% over last 5 checkpoints`,
+            };
+        }
+        if (ewma > ewmaAt5Ago + dropThreshold) {
+            return { hasDrift: false, trend: 'improving', ewma };
+        }
+    }
+    return { hasDrift: false, trend: 'stable', ewma };
 }
+// ─── Session Management ─────────────────────────────────────────────
 /**
  * Get or create checkpoint session
  */
@@ -45,22 +124,29 @@ export function getOrCreateCheckpointSession(cwd) {
 /**
  * Record a checkpoint with quality evaluation
  */
-export function recordCheckpoint(cwd, progressPct, filesChanged, summary, qualityScore) {
+export function recordCheckpoint(cwd, progressPct, filesChanged, summary, qualityScore, config) {
     const session = getOrCreateCheckpointSession(cwd);
     const warnings = [];
-    // Default threshold
-    const qualityThreshold = 80;
+    const alpha = config?.ewma_alpha ?? 0.3;
+    const qualityThreshold = config?.quality_threshold ?? 80;
     // Check if quality is below threshold
     const shouldContinue = qualityScore >= qualityThreshold;
     if (!shouldContinue) {
         warnings.push(`Quality score ${qualityScore}% is below threshold ${qualityThreshold}%`);
     }
-    // Detect drift (quality degradation over recent checkpoints)
+    // Compute EWMA for this checkpoint
+    const previousEWMA = session.checkpoints.length > 0
+        ? session.checkpoints[session.checkpoints.length - 1].ewma
+        : undefined;
+    const currentEWMA = computeEWMA(qualityScore, previousEWMA, alpha);
+    // Detect drift using EWMA
     if (session.checkpoints.length >= 2) {
-        const recentScores = session.checkpoints.slice(-3).map(cp => cp.qualityScore);
-        const avgRecent = recentScores.reduce((a, b) => a + b, 0) / recentScores.length;
-        if (qualityScore < avgRecent - 10) {
-            warnings.push(`Drift detected: quality dropped from avg ${avgRecent.toFixed(0)}% to ${qualityScore}%`);
+        const dropThreshold = config?.drift_drop_threshold ?? 15;
+        // Temporarily add this checkpoint for drift analysis
+        const tempCheckpoints = [...session.checkpoints, { qualityScore }];
+        const { hasDrift, reason } = detectDrift(tempCheckpoints, alpha, dropThreshold);
+        if (hasDrift && reason) {
+            warnings.push(reason);
         }
     }
     const checkpoint = {
@@ -71,6 +157,7 @@ export function recordCheckpoint(cwd, progressPct, filesChanged, summary, qualit
         summary,
         qualityScore,
         warnings,
+        ewma: Math.round(currentEWMA * 10) / 10,
     };
     session.checkpoints.push(checkpoint);
     session.lastCheckpoint = new Date();
@@ -101,7 +188,6 @@ export function completeCheckpointSession(cwd) {
 export function abortCheckpointSession(cwd, reason) {
     if (currentCheckpointSession) {
         currentCheckpointSession.status = 'aborted';
-        // Add final checkpoint with abort reason
         currentCheckpointSession.checkpoints.push({
             checkpointId: generateCheckpointId(),
             timestamp: new Date(),
@@ -162,30 +248,7 @@ function timeSinceLastCheckpoint(session) {
     const lastTime = session.lastCheckpoint || session.startedAt;
     return (Date.now() - lastTime.getTime()) / 1000 / 60; // minutes
 }
-/**
- * Detect quality drift pattern
- */
-function detectDrift(checkpoints) {
-    if (checkpoints.length < 3) {
-        return { hasDrift: false, trend: 'stable' };
-    }
-    const recent = checkpoints.slice(-5);
-    const scores = recent.map(cp => cp.qualityScore);
-    // Calculate trend using simple linear regression
-    const n = scores.length;
-    const sumX = (n * (n - 1)) / 2;
-    const sumY = scores.reduce((a, b) => a + b, 0);
-    const sumXY = scores.reduce((sum, y, x) => sum + x * y, 0);
-    const sumX2 = (n * (n - 1) * (2 * n - 1)) / 6;
-    const slope = (n * sumXY - sumX * sumY) / (n * sumX2 - sumX * sumX);
-    if (slope < -2) {
-        return { hasDrift: true, trend: 'degrading' };
-    }
-    else if (slope > 2) {
-        return { hasDrift: false, trend: 'improving' };
-    }
-    return { hasDrift: false, trend: 'stable' };
-}
+// ─── Gate Implementation ────────────────────────────────────────────
 export class CheckpointGate extends Gate {
     config;
     constructor(config = {}) {
@@ -196,6 +259,8 @@ export class CheckpointGate extends Gate {
             quality_threshold: config.quality_threshold ?? 80,
             drift_detection: config.drift_detection ?? true,
             auto_save_on_failure: config.auto_save_on_failure ?? true,
+            ewma_alpha: config.ewma_alpha ?? 0.3,
+            drift_drop_threshold: config.drift_drop_threshold ?? 15,
         };
     }
     get provenance() { return 'governance'; }
@@ -206,7 +271,6 @@ export class CheckpointGate extends Gate {
         const failures = [];
         const session = getCheckpointSession(context.cwd);
         if (!session || session.checkpoints.length === 0) {
-            // No checkpoints yet, skip
             return [];
         }
         Logger.info(`Checkpoint Gate: ${session.checkpoints.length} checkpoints in session`);
@@ -220,11 +284,11 @@ export class CheckpointGate extends Gate {
         if (lastCheckpoint.qualityScore < (this.config.quality_threshold ?? 80)) {
             failures.push(this.createFailure(`Quality score ${lastCheckpoint.qualityScore}% is below threshold ${this.config.quality_threshold}%`, lastCheckpoint.filesChanged, 'Review recent changes and address quality issues before continuing', 'Quality Below Threshold', undefined, undefined, 'high'));
         }
-        // Check 3: Drift detection
+        // Check 3: EWMA-based drift detection (v5)
         if (this.config.drift_detection) {
-            const { hasDrift, trend } = detectDrift(session.checkpoints);
+            const { hasDrift, trend, ewma, reason } = detectDrift(session.checkpoints, this.config.ewma_alpha ?? 0.3, this.config.drift_drop_threshold ?? 15);
             if (hasDrift && trend === 'degrading') {
-                failures.push(this.createFailure(`Quality drift detected: scores are degrading over time`, undefined, 'Agent performance is declining. Consider pausing and reviewing recent work.', 'Quality Drift Detected', undefined, undefined, 'high'));
+                failures.push(this.createFailure(`Quality drift detected (EWMA: ${ewma.toFixed(1)}%): ${reason || 'scores are degrading over time'}`, undefined, 'Agent performance is declining. Consider pausing and reviewing recent work.', 'Quality Drift Detected', undefined, undefined, 'high'));
             }
         }
         return failures;
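
To see how the two signals in the new `detectDrift` differ, it can help to run a slow, steady decline through the same arithmetic. The sketch below is a typed restatement that mirrors the loop bounds and defaults shown in the diff (α=0.3, threshold 15). `classifyDrift` and `ewmaThrough` are hypothetical names, and the return value is simplified to a label.

```typescript
type DriftSignal = 'none' | 'sudden' | 'gradual';

// EWMA over scores[0..lastIndex], seeded with the first score.
function ewmaThrough(scores: number[], lastIndex: number, alpha: number): number {
  let ewma = scores[0];
  for (let i = 1; i <= lastIndex; i++) {
    ewma = alpha * scores[i] + (1 - alpha) * ewma;
  }
  return ewma;
}

// Simplified restatement of the two checks in the diff; not the package's export.
function classifyDrift(scores: number[], alpha = 0.3, dropThreshold = 15): DriftSignal {
  if (scores.length < 3) return 'none';
  const last = scores.length - 1;
  const ewmaNow = ewmaThrough(scores, last, alpha);
  // Signal 1: latest score far below the smoothed average.
  if (scores[last] < ewmaNow - dropThreshold) return 'sudden';
  // Signal 2: the smoothed average itself has fallen (needs 6+ checkpoints).
  if (scores.length >= 6) {
    const ewmaEarlier = ewmaThrough(scores, scores.length - 5, alpha);
    if (ewmaNow < ewmaEarlier - dropThreshold) return 'gradual';
  }
  return 'none';
}

const slowDecline = classifyDrift([95, 90, 85, 80, 75, 70, 65, 60]);
const steady = classifyDrift([85, 86, 84, 85, 85, 86]);
```

In the declining series each step loses only 5 points, so the latest score (60) sits about 10.7 below the current EWMA (about 70.7) and the sudden-drop check stays quiet. The earlier EWMA is about 87.7, a fall of roughly 17, so the gradual check fires.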

package/dist/gates/checkpoint.test.js CHANGED

@@ -72,11 +72,14 @@ describe('CheckpointGate', () => {
     });
     describe('drift detection', () => {
         it('should detect quality degradation', () => {
-            // Record several checkpoints with declining quality
+            // Record checkpoints with a sharp quality drop.
+            // EWMA with α=0.3: after 95, 90, the EWMA ≈ 93.5.
+            // A sudden drop to 55 creates a gap of ~38 which exceeds the
+            // drift_drop_threshold of 15, triggering the "Sudden quality drop" warning.
             recordCheckpoint(testDir, 20, [], 'Start', 95);
             recordCheckpoint(testDir, 40, [], 'Middle', 90);
-            const result = recordCheckpoint(testDir, 60, [], 'Decline', 75);
-            expect(result.warnings.some(w => w.includes('Drift detected'))).toBe(true);
+            const result = recordCheckpoint(testDir, 60, [], 'Decline', 55);
+            expect(result.warnings.some(w => w.includes('Sudden quality drop'))).toBe(true);
         });
         it('should not flag stable quality', () => {
             recordCheckpoint(testDir, 20, [], 'Start', 85);
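
The arithmetic in the updated test comment can be checked by hand. One detail worth noting when reading it alongside the `checkpoint.js` diff: `detectDrift` as shown there folds the newest score into the EWMA before comparing, so the gap it reports appears to be measured against the post-drop EWMA rather than the 93.5 mentioned in the comment. The sketch below works through both numbers; it is a reading of the diff, not output captured from the package.

```typescript
const alpha = 0.3;

// EWMA after the first two checkpoints (95, then 90): about 93.5.
const ewmaBeforeDrop = alpha * 90 + (1 - alpha) * 95;

// EWMA once the third checkpoint (55) is folded in: about 81.95.
const ewmaAfterDrop = alpha * 55 + (1 - alpha) * ewmaBeforeDrop;

// Gap against the pre-drop EWMA, as the test comment describes it: about 38.5.
const gapVsPrior = ewmaBeforeDrop - 55;

// Gap against the EWMA that includes the drop: about 26.95.
const gapVsCurrent = ewmaAfterDrop - 55;

// Either way the gap exceeds the default drift_drop_threshold of 15.
const exceedsThreshold = gapVsPrior > 15 && gapVsCurrent > 15;
```

Both readings clear the threshold, so the test's expectation of a "Sudden quality drop" warning holds under either interpretation.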

package/dist/gates/dependency.d.ts CHANGED

@@ -1,7 +1,46 @@
+/**
+ * Dependency Guardian Gate (v2)
+ *
+ * Detects dependency issues that AI agents commonly introduce:
+ * 1. Forbidden dependencies (existing) — packages banned by project standards
+ * 2. Unused dependencies (NEW) — installed but never imported
+ * 3. Heavy alternatives (NEW) — bloated packages with lighter alternatives
+ * 4. Duplicate purpose (NEW) — multiple packages solving the same problem
+ *
+ * AI agents are particularly prone to:
+ * - Adding packages they've seen in training data without checking existing deps
+ * - Using heavy/popular packages when lighter alternatives exist
+ * - Installing multiple HTTP clients, date libs, etc. across different sessions
+ *
+ * @since v2.0.0 (forbidden deps)
+ * @since v5.1.0 (unused, heavy alternatives, duplicate purpose)
+ */
 import { Failure, Config } from '../types/index.js';
 import { Gate, GateContext } from './base.js';
 export declare class DependencyGate extends Gate {
     private config;
     constructor(config: Config);
     run(context: GateContext): Promise<Failure[]>;
+    /**
+     * Detect dependencies listed in package.json but never imported.
+     * Scans all source files for import/require statements.
+     *
+     * Allowlist handles side-effect imports like:
+     * - @types/* (TypeScript type packages)
+     * - polyfills (core-js, regenerator-runtime)
+     * - PostCSS/Babel plugins (used in config files, not source)
+     */
+    private detectUnusedDeps;
+    /**
+     * Detect heavy/bloated packages that have lighter modern alternatives.
+     * AI agents tend to reach for the most popular (heaviest) package
+     * because that's what they've seen most in training data.
+     */
+    private detectHeavyAlternatives;
+    /**
+     * Detect when multiple packages serve the same purpose.
+     * This is a classic AI drift symptom — different sessions install different
+     * packages for the same task (e.g., axios in one PR, got in another).
+     */
+    private detectDuplicatePurpose;
 }
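
The declaration file only names `detectDuplicatePurpose`; its body lives in `dependency.js`, which is not shown in this section. As a rough idea of what a duplicate-purpose check could look like, here is a minimal sketch. The purpose table, the `findDuplicatePurpose` function, and the version strings are all hypothetical and chosen to match the examples in the doc comment (HTTP clients, date libraries).

```typescript
// Hypothetical purpose table; the package's real list is not shown here.
const PURPOSE_GROUPS: Record<string, string[]> = {
  'http-client': ['axios', 'got', 'node-fetch'],
  'date-library': ['moment', 'dayjs', 'date-fns'],
};

// Returns each purpose for which more than one known package is installed.
function findDuplicatePurpose(
  dependencies: Record<string, string>,
): Record<string, string[]> {
  const installed = new Set(Object.keys(dependencies));
  const duplicates: Record<string, string[]> = {};
  for (const [purpose, packages] of Object.entries(PURPOSE_GROUPS)) {
    const present = packages.filter((name) => installed.has(name));
    if (present.length > 1) duplicates[purpose] = present;
  }
  return duplicates;
}

// Illustrative package.json "dependencies" block (versions are made up).
const dupes = findDuplicatePurpose({
  axios: '^1.6.0',
  got: '^14.0.0',
  dayjs: '^1.11.0',
});
```

Here `axios` and `got` are both HTTP clients, so the `http-client` group is reported, while `dayjs` alone does not trigger the date-library group.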