agentic-qe 3.7.7 → 3.7.8

Files changed (113)

package/.claude/helpers/statusline-v3.cjs CHANGED

@@ -308,7 +308,19 @@ function getLearningMetrics(projectDir) {
   // V3 trajectories (new V3 trajectory tracking)
   const trajectories = parseInt(sqlite3Query(dbPath, 'SELECT COUNT(*) FROM qe_trajectories')) || 0;
   // Captured experiences (task execution captures)
-  const capturedExp = parseInt(sqlite3Query(dbPath, 'SELECT COUNT(*) FROM captured_experiences')) || 0;
+  // Use SUM(consolidation_count) for monotonically non-decreasing counter:
+  // - New experience → +1
+  // - Merge A into B → A excluded (consolidated_into set), B's count += A's count → net 0
+  // - Archive → row stays, still counted → net 0
+  // Falls back to COUNT(*) if consolidation columns not yet added
+  let capturedExp = 0;
+  const consolidatedQuery = sqlite3Query(dbPath,
+    "SELECT COALESCE(SUM(consolidation_count), COUNT(*)) FROM captured_experiences WHERE consolidated_into IS NULL OR consolidated_into = 'archived'", '__FAIL__');
+  if (consolidatedQuery !== '__FAIL__') {
+    capturedExp = parseInt(consolidatedQuery) || 0;
+  } else {
+    capturedExp = parseInt(sqlite3Query(dbPath, 'SELECT COUNT(*) FROM captured_experiences')) || 0;
+  }
   // Memory entries with learning data (MCP-stored experiences)
   const memoryLearning = parseInt(sqlite3Query(dbPath, "SELECT COUNT(*) FROM memory_entries WHERE key LIKE 'learning%' OR key LIKE 'phase2/learning%'")) || 0;
   // QE pattern usage (hook-recorded outcomes from aqe hooks post-task/post-edit)
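
The comments in this hunk describe why `SUM(consolidation_count)` never decreases under merge and archive operations. The following is a minimal in-memory sketch of that invariant, not the package's implementation: the `ExperienceRow` shape and the `addExperience`, `mergeInto`, and `archive` helpers are hypothetical and model only the two columns named in the query.

```typescript
// Hypothetical in-memory stand-in for rows of captured_experiences.
interface ExperienceRow {
  id: string;
  consolidationCount: number;      // mirrors consolidation_count
  consolidatedInto: string | null; // mirrors consolidated_into
}

// Mirrors the WHERE clause and SUM in the query above: only rows that are
// active (NULL) or archived contribute their consolidation count.
function capturedCount(rows: ExperienceRow[]): number {
  return rows
    .filter((r) => r.consolidatedInto === null || r.consolidatedInto === 'archived')
    .reduce((sum, r) => sum + r.consolidationCount, 0);
}

// New experience: one new active row with count 1, so the total grows by 1.
function addExperience(rows: ExperienceRow[], id: string): ExperienceRow[] {
  return [...rows, { id, consolidationCount: 1, consolidatedInto: null }];
}

// Merge source into target: the source row is excluded from the sum and the
// target absorbs its count, so the total is unchanged.
function mergeInto(rows: ExperienceRow[], sourceId: string, targetId: string): ExperienceRow[] {
  const source = rows.find((r) => r.id === sourceId);
  const targetExists = rows.some((r) => r.id === targetId);
  if (!source || !targetExists || sourceId === targetId) return rows;
  const sourceCount = source.consolidationCount;
  return rows.map((r) => {
    if (r.id === sourceId) return { ...r, consolidatedInto: targetId };
    if (r.id === targetId) return { ...r, consolidationCount: r.consolidationCount + sourceCount };
    return r;
  });
}

// Archive: the row stays and is still counted, so the total is unchanged.
function archive(rows: ExperienceRow[], id: string): ExperienceRow[] {
  return rows.map((r) =>
    r.id === id && r.consolidatedInto === null ? { ...r, consolidatedInto: 'archived' } : r,
  );
}

let rows: ExperienceRow[] = [];
rows = addExperience(rows, 'a');
rows = addExperience(rows, 'b');
const beforeMerge = capturedCount(rows);

rows = mergeInto(rows, 'a', 'b');
const afterMerge = capturedCount(rows);

rows = archive(rows, 'b');
const afterArchive = capturedCount(rows);

console.log({ beforeMerge, afterMerge, afterArchive });
```

A plain `COUNT(*)` over the same filter would drop from 2 to 1 after the merge, which is presumably the regression this hunk avoids in the status line.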

package/.claude/skills/skills-manifest.json CHANGED

@@ -904,7 +904,7 @@
   },
   "metadata": {
     "generatedBy": "Agentic QE Fleet",
-    "fleetVersion": "3.7.7",
+    "fleetVersion": "3.7.8",
     "manifestVersion": "1.3.0",
     "lastUpdated": "2026-02-04T00:00:00.000Z",
     "contributors": [

package/CHANGELOG.md CHANGED

@@ -5,6 +5,25 @@ All notable changes to the Agentic QE project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [3.7.8] - 2026-03-04
+### Added
+- **Loki-Mode adversarial quality gates (ADR-074)** — 7 new features to catch sycophantic AI outputs, hollow tests, and routing drift. All enabled by default (opt-out via config flags):
+  - **Anti-sycophancy scorer**: Detects rubber-stamp consensus via Jaccard similarity, confidence uniformity, and reasoning overlap across model votes
+  - **Test quality gates**: Catches tautological assertions (`expect(true).toBe(true)`), empty test bodies, and missing source imports in generated tests
+  - **Blind review orchestrator**: Runs N parallel test generators with varied temperatures, deduplicates results via Jaccard similarity
+  - **EMA calibration**: Exponential moving average tracks per-agent accuracy and derives dynamic voting weights, with SQLite state persistence
+  - **Edge-case injection**: Queries historical patterns from the learning store and injects proven edge cases into test generation prompts
+  - **Complexity-driven team composition**: Maps 8-dimension complexity analysis (AST + security + concurrency + API surface) to agent team composition
+  - **Auto-escalation tracker**: Consecutive failures auto-promote agent tier; consecutive successes auto-demote for cost optimization
+- **Smart experience consolidation** — Replace destructive pruning with intelligent consolidation that preserves high-value learning patterns while managing memory growth
+- **Multi-language test generation plan** — Architecture decision records (ADR-075 through ADR-079) for unified test framework type system, Tree-sitter WASM parser, compilation validation loop, backward-compatible API, and language-specific path resolution
+### Changed
+- **Loki-mode features enabled by default** — All 6 config flags (`enableSycophancyCheck`, `enableTestQualityGate`, `enableEdgeCaseInjection`, `enableEMACalibration`, `enableAutoEscalation`, `enableComplexityComposition`) default to `true` for immediate quality improvement
 ## [3.7.7] - 2026-03-02
 ### Added
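
The 3.7.8 entry above says the anti-sycophancy scorer detects rubber-stamp consensus via Jaccard similarity across model votes. As a rough illustration of that one signal only, the sketch below computes the mean pairwise Jaccard similarity of word tokens in each vote's reasoning. The `Vote` shape, the tokenizer, and the `0.8` threshold are assumptions made for this example, not the package's API, and the changelog's other two signals (confidence uniformity and reasoning overlap) are not modelled.

```typescript
// Hypothetical vote shape; the real scorer's input type is not shown in the diff.
interface Vote {
  model: string;
  reasoning: string;
}

// Lowercase word tokens, deduplicated.
function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/\W+/).filter((t) => t.length > 0));
}

// Jaccard similarity: |A intersect B| / |A union B|. Two empty sets count as identical.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  let intersection = 0;
  a.forEach((token) => {
    if (b.has(token)) intersection += 1;
  });
  return intersection / (a.size + b.size - intersection);
}

// Average similarity over all unordered pairs of votes; 0 when fewer than two votes.
function meanPairwiseSimilarity(votes: Vote[]): number {
  const sets = votes.map((v) => tokenize(v.reasoning));
  let total = 0;
  let pairs = 0;
  for (let i = 0; i < sets.length; i++) {
    for (let j = i + 1; j < sets.length; j++) {
      total += jaccard(sets[i], sets[j]);
      pairs += 1;
    }
  }
  return pairs === 0 ? 0 : total / pairs;
}

// Assumed cut-off for illustration only.
const RUBBER_STAMP_THRESHOLD = 0.8;

function looksLikeRubberStamp(votes: Vote[]): boolean {
  return meanPairwiseSimilarity(votes) >= RUBBER_STAMP_THRESHOLD;
}

const rubberStamp: Vote[] = [
  { model: 'm1', reasoning: 'Looks good to me' },
  { model: 'm2', reasoning: 'looks good to me' },
  { model: 'm3', reasoning: 'Looks good to me.' },
];
const diverse: Vote[] = [
  { model: 'm1', reasoning: 'Missing null check in parser' },
  { model: 'm2', reasoning: 'Race condition under concurrent writes' },
];

console.log(looksLikeRubberStamp(rubberStamp), looksLikeRubberStamp(diverse));
```

The same `jaccard` helper could in principle serve the deduplication step the changelog attributes to the blind review orchestrator, though the diff does not show how either feature is implemented.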

package/README.md CHANGED

@@ -94,6 +94,7 @@ For client-specific setup examples, see [Platform Setup Guide](docs/platform-set
 - ✅ **OpenCode Support** (v3.7.1): 59 agent configs, 86 skill configs (78 QE + 8 general dev), 5 tool wrappers, SSE/WS/HTTP transport, output compaction, graceful degradation, `aqe init --with-opencode` auto-provisioning
 - ✅ **AWS Kiro Support** (v3.7.2): 87 agent configs, 86 skill configs, 5 event-driven hooks, 2 steering files, MCP config, `aqe init --with-kiro` auto-provisioning
 - ✅ **Multi-Platform Support** (v3.7.4): 8 new platform integrations — GitHub Copilot, Cursor, Cline, Kilo Code, Roo Code, OpenAI Codex CLI, Windsurf, Continue.dev — with JSON/TOML/YAML config generation, behavioral rules, and `aqe platform list/setup/verify` CLI
+- ✅ **Loki-Mode Quality Gates** (v3.7.7): Anti-sycophancy scoring, test quality gates, blind review, EMA calibration, edge-case injection, complexity-driven composition, auto-escalation — enabled by default (opt-out)
 - ✅ **V2 Backward Compatibility**: All V2 agents map to V3 equivalents
 - ✅ **78 QE Skills**: 46 Tier 3 verified + 32 additional QE skills (QCSD swarms, n8n testing, enterprise integration, qe-* domains)
@@ -299,6 +300,41 @@ aqe learning stats
 ---
+### 🛡️ Loki-Mode Quality Gates (v3.7.7)
+V3.7.7 adds **7 adversarial quality features** inspired by loki-mode — designed to catch sycophantic AI outputs, hollow tests, and routing drift. All features are **enabled by default** (opt-out via config flags).
+| Feature | Config Flag | Description |
+|---------|------------|-------------|
+| **Anti-Sycophancy Scorer** | `enableSycophancyCheck` | Detects rubber-stamp consensus via Jaccard similarity, confidence uniformity, and reasoning overlap |
+| **Test Quality Gates** | `enableTestQualityGate` | Catches tautological assertions (`expect(true).toBe(true)`), empty test bodies, and missing source imports |
+| **Blind Review** | N/A (API option) | Runs N parallel test generators with varied temperatures, deduplicates via Jaccard |
+| **EMA Calibration** | `enableEMACalibration` | Exponential moving average tracks per-agent accuracy, derives dynamic voting weights |
+| **Edge-Case Injection** | `enableEdgeCaseInjection` | Queries historical patterns and injects proven edge cases into test generation prompts |
+| **Complexity Composition** | `enableComplexityComposition` | Maps 8-dimension complexity (AST + security + concurrency) to agent team composition |
+| **Auto-Escalation** | `enableAutoEscalation` | Consecutive failures auto-promote agent tier; consecutive successes auto-demote |
+```typescript
+// All features are ON by default. To disable specific features:
+const config: Partial<RoutingConfig> = {
+  enableEMACalibration: false,    // Disable EMA voting weights
+  enableAutoEscalation: false,    // Disable auto tier promotion
+};
+const consensusConfig: Partial<ConsensusEngineConfig> = {
+  enableSycophancyCheck: false,   // Disable rubber-stamp detection
+};
+const testConfig: Partial<TestGeneratorConfig> = {
+  enableTestQualityGate: false,   // Disable tautology detection
+  enableEdgeCaseInjection: false, // Disable pattern injection
+};
+```
+> See [docs/loki-mode-features.md](docs/loki-mode-features.md) for detailed usage examples and configuration reference.
+---
 ### 🌙 Dream Cycles & Neural Learning
 V3 introduces **Dream cycles** for neural consolidation and continuous improvement:
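
The EMA Calibration row in the Loki-Mode table added by this README hunk describes tracking per-agent accuracy with an exponential moving average and deriving voting weights from it. The sketch below shows one way that could work, under stated assumptions: the smoothing factor `ALPHA`, the prior `INITIAL_ACCURACY`, the agent names, and the normalize-to-sum-1 weighting are all hypothetical, and the SQLite state persistence mentioned in the changelog is left out.

```typescript
const ALPHA = 0.5;            // assumed smoothing factor
const INITIAL_ACCURACY = 0.5; // assumed prior before any outcome is seen

// Standard EMA update: new = alpha * observation + (1 - alpha) * previous,
// where the observation is 1 for a correct outcome and 0 otherwise.
function updateEma(previous: number, correct: boolean, alpha: number): number {
  const outcome = correct ? 1 : 0;
  return alpha * outcome + (1 - alpha) * previous;
}

// Turn per-agent accuracy into weights that sum to 1.
// Falls back to uniform weights if every accuracy is 0.
function votingWeights(accuracyByAgent: Record<string, number>): Record<string, number> {
  const agents = Object.keys(accuracyByAgent);
  const total = agents.reduce((sum, agent) => sum + accuracyByAgent[agent], 0);
  const weights: Record<string, number> = {};
  agents.forEach((agent) => {
    weights[agent] = total === 0 ? 1 / agents.length : accuracyByAgent[agent] / total;
  });
  return weights;
}

// Hypothetical outcome log: agent-a was right twice, agent-b wrong twice.
const outcomesByAgent: Record<string, boolean[]> = {
  'agent-a': [true, true],
  'agent-b': [false, false],
};

const accuracyByAgent: Record<string, number> = {};
Object.keys(outcomesByAgent).forEach((agent) => {
  accuracyByAgent[agent] = outcomesByAgent[agent].reduce(
    (ema, correct) => updateEma(ema, correct, ALPHA),
    INITIAL_ACCURACY,
  );
});

const weights = votingWeights(accuracyByAgent);
console.log(accuracyByAgent, weights);
```

With these inputs, agent-a's accuracy moves 0.5, 0.75, 0.875 and agent-b's moves 0.5, 0.25, 0.125, so agent-a ends up with seven times agent-b's voting weight. A smaller `ALPHA` would make the weights react more slowly to recent outcomes.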