npm - agentic-qe - Versions diffs - 3.7.18 → 3.7.20 - Mend

agentic-qe 3.7.18 → 3.7.20

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (50)

package/.claude/skills/iterative-loop/SKILL.md +371 -0
package/.claude/skills/skills-manifest.json +36 -8
package/.claude/skills/validation-pipeline/SKILL.md +164 -0
package/.claude/skills/validation-pipeline/evals/validation-pipeline.yaml +544 -0
package/.claude/skills/validation-pipeline/schemas/output.json +193 -0
package/.claude/skills/validation-pipeline/scripts/validate-config.json +34 -0
package/CHANGELOG.md +7 -0
package/README.md +5 -3
package/assets/skills/skills-manifest.json +17 -1
package/assets/skills/validation-pipeline/SKILL.md +164 -0
package/assets/skills/validation-pipeline/evals/validation-pipeline.yaml +544 -0
package/assets/skills/validation-pipeline/schemas/output.json +193 -0
package/assets/skills/validation-pipeline/scripts/validate-config.json +34 -0
package/dist/cli/bundle.js +9 -7
package/dist/context/compiler.js +4 -0
package/dist/context/index.d.ts +2 -0
package/dist/context/index.js +2 -0
package/dist/context/sources/defect-source.d.ts +17 -0
package/dist/context/sources/defect-source.js +102 -0
package/dist/context/sources/index.d.ts +2 -0
package/dist/context/sources/index.js +2 -0
package/dist/context/sources/requirements-source.d.ts +17 -0
package/dist/context/sources/requirements-source.js +119 -0
package/dist/coordination/task-executor.js +7 -1
package/dist/coordination/yaml-pipeline-loader.d.ts +32 -0
package/dist/coordination/yaml-pipeline-loader.js +389 -0
package/dist/coordination/yaml-pipeline-registry.d.ts +61 -0
package/dist/coordination/yaml-pipeline-registry.js +143 -0
package/dist/governance/continue-gate-integration.js +1 -1
package/dist/governance/feature-flags.js +2 -2
package/dist/init/settings-merge.js +2 -0
package/dist/mcp/bundle.js +8674 -1248
package/dist/mcp/entry.js +21 -0
package/dist/mcp/handlers/domain-handler-configs.js +11 -0
package/dist/mcp/handlers/index.d.ts +2 -0
package/dist/mcp/handlers/index.js +4 -0
package/dist/mcp/handlers/pipeline-handlers.d.ts +75 -0
package/dist/mcp/handlers/pipeline-handlers.js +208 -0
package/dist/mcp/handlers/validation-pipeline-handler.d.ts +53 -0
package/dist/mcp/handlers/validation-pipeline-handler.js +118 -0
package/dist/mcp/protocol-server.js +167 -1
package/dist/mcp/server.js +75 -1
package/dist/workers/daemon.js +3 -2
package/dist/workers/index.d.ts +6 -0
package/dist/workers/index.js +6 -0
package/dist/workers/workers/heartbeat-scheduler.d.ts +45 -0
package/dist/workers/workers/heartbeat-scheduler.js +312 -0
package/dist/workers/workers/index.d.ts +2 -1
package/dist/workers/workers/index.js +2 -1
package/package.json +1 -1

package/.claude/skills/iterative-loop/SKILL.md ADDED

@@ -0,0 +1,371 @@
+---
+name: "Iterative Loop"
+description: "Implement continuous AI iteration loops for complex development tasks. Use when building features requiring test-driven refinement, implementing tasks with clear success criteria, or automating iterative improvement workflows. Based on the Ralph Wiggum technique from Claude Code plugins."
+---
+# Iterative Loop
+## Overview
+The Iterative Loop skill implements **continuous AI-driven development loops** that persist until completion criteria are met. Inspired by the Ralph Wiggum technique, this approach enables autonomous, self-correcting development cycles where the AI sees its previous work in files and git history, iteratively improving until success.
+## Core Philosophy
+1. **Iteration > Perfection** - Don't aim for perfect on first try; let the loop refine the work
+2. **Failures Are Data** - Each failure provides information to improve the next attempt
+3. **Clear Criteria** - Success must be objectively measurable (tests, metrics, validations)
+4. **Persistence Wins** - Keep trying until success; the loop handles retry logic automatically
+## Prerequisites
+- Claude Code with session management
+- Clear completion criteria (tests, linting, metrics)
+- Version control (git) for tracking iterations
+---
+## Quick Start
+### Basic Iterative Development Pattern
+```bash
+# Define task with clear completion criteria
+TASK="Implement user authentication with JWT.
+Success criteria:
+- All unit tests pass
+- Integration tests pass
+- No TypeScript errors
+- Security audit passes
+Output <promise>COMPLETE</promise> when all criteria met."
+# Execute iterative loop (conceptual)
+while ! task_complete; do
+  claude_execute "$TASK"
+  check_completion_criteria
+done
+```
+### AQE v3 Integration Example
+```bash
+# Using claude-flow hooks for iterative task
+npx @claude-flow/cli@latest hooks pre-task --description "Implement auth with iteration" --taskId "auth-impl"
+# Store iteration state in memory
+npx @claude-flow/cli@latest memory store \
+  --key "iteration-auth" \
+  --value '{"iteration": 1, "maxIterations": 20, "criteria": "all tests pass"}' \
+  --namespace iterations
+```
+---
+## Step-by-Step Guide
+### Step 1: Define Clear Success Criteria
+**Essential**: Every iterative task MUST have objectively measurable completion criteria.
+**Good Criteria Examples:**
+```markdown
+✅ All unit tests pass (npm test returns exit code 0)
+✅ Coverage > 80% (coverage report shows 80%+)
+✅ No TypeScript errors (tsc --noEmit returns 0)
+✅ Linting passes (eslint returns 0)
+✅ Performance < 100ms (benchmark shows < 100ms)
+```
+**Bad Criteria Examples:**
+```markdown
+❌ "Code looks good" (subjective)
+❌ "Works properly" (undefined)
+❌ "Well-structured" (no measurable check)
+```
+### Step 2: Structure the Task with Phases
+Break complex tasks into incremental phases:
+```markdown
+## Task: Implement User Authentication
+### Phase 1: Data Layer
+- Create User model with Prisma schema
+- Write migration
+- Run tests: `npm test -- --grep "User model"`
+- Criteria: Model tests pass
+### Phase 2: Service Layer
+- Implement AuthService with JWT
+- Add token generation/validation
+- Run tests: `npm test -- --grep "AuthService"`
+- Criteria: Service tests pass
+### Phase 3: API Layer
+- Create /auth/login endpoint
+- Create /auth/register endpoint
+- Run tests: `npm test -- --grep "auth API"`
+- Criteria: API tests pass
+### Phase 4: Integration
+- End-to-end authentication flow
+- Run tests: `npm test`
+- Criteria: ALL tests pass
+Output <promise>AUTH_COMPLETE</promise> when Phase 4 passes.
+```
+### Step 3: Implement Safety Mechanisms
+Always include escape conditions:
+```markdown
+## Safety Rules
+1. **Max Iterations**: Stop after 20 attempts
+2. **Stuck Detection**: After 5 iterations without progress:
+   - Document what's blocking
+   - List attempted approaches
+   - Suggest alternative strategies
+3. **Critical Errors**: Stop immediately if:
+   - Database corruption detected
+   - Security vulnerability introduced
+   - Breaking changes to existing features
+```
+### Step 4: Execute with Verification
+Each iteration should:
+1. Make targeted changes
+2. Run verification (tests, lint, build)
+3. Analyze results
+4. Plan next iteration based on feedback
+```bash
+# Iteration pattern
+1. Read previous state (files, git log)
+2. Identify remaining work
+3. Implement specific change
+4. Run verification suite
+5. If all pass -> output completion promise
+6. If failures -> analyze and continue iteration
+```
+---
+## Iterative Patterns
+### Pattern 1: Test-Driven Iteration
+```markdown
+## TDD Iteration Task
+1. Write failing test for [feature]
+2. Implement minimal code to pass test
+3. Run `npm test`
+4. If test fails -> debug and fix implementation
+5. If test passes -> check if more tests needed
+6. Repeat until all acceptance tests pass
+7. Refactor if needed
+8. Output <promise>TDD_COMPLETE</promise>
+```
+### Pattern 2: Bug Fix Iteration
+```markdown
+## Bug Fix Task
+1. Write failing test that reproduces bug
+2. Implement fix
+3. Run test suite
+4. If reproduction test fails -> analyze why fix didn't work
+5. If other tests fail -> fix regressions
+6. If all tests pass -> output <promise>BUG_FIXED</promise>
+Max iterations: 10
+After 5 iterations without fix:
+- Document root cause analysis
+- Suggest alternative approaches
+```
+### Pattern 3: Coverage Improvement Iteration
+```markdown
+## Coverage Improvement Task
+Target: 80% line coverage
+1. Run coverage analysis
+2. Identify uncovered code paths
+3. Write test for highest-impact uncovered path
+4. Run tests with coverage
+5. If coverage >= 80% -> output <promise>COVERAGE_ACHIEVED</promise>
+6. If coverage < 80% -> continue iteration
+Max iterations: 30
+Progress check: If coverage doesn't improve for 3 iterations -> analyze blockers
+```
+### Pattern 4: Performance Optimization Iteration
+```markdown
+## Performance Optimization Task
+Target: Response time < 100ms
+1. Run performance benchmark
+2. Identify slowest operation
+3. Implement optimization
+4. Run benchmark again
+5. If target met -> output <promise>PERF_TARGET_MET</promise>
+6. If not improved -> try different approach
+Max iterations: 15
+Record metrics each iteration for trend analysis
+```
+---
+## Integration with Claude Flow
+### Memory-Enhanced Iteration
+```bash
+# Store iteration state
+npx @claude-flow/cli@latest memory store \
+  --key "current-iteration" \
+  --value '{"task": "auth", "iteration": 5, "lastResult": "2 tests failing"}' \
+  --namespace iterations
+# Search for similar past iterations
+npx @claude-flow/cli@latest memory search \
+  --query "auth implementation" \
+  --namespace iterations
+# Learn from successful completions
+npx @claude-flow/cli@latest hooks post-task \
+  --taskId "auth-impl" \
+  --success true \
+  --quality 0.9
+```
+### Swarm-Coordinated Iteration
+For complex tasks, use multiple agents iterating in parallel:
+```bash
+# Initialize swarm for parallel iteration
+npx @claude-flow/cli@latest swarm init --topology mesh --max-agents 5
+# Spawn specialized iterators
+Task("Iterate on unit tests", "Fix failing unit tests until all pass", "tester")
+Task("Iterate on integration", "Fix integration tests until all pass", "tester")
+Task("Iterate on performance", "Optimize until benchmarks pass", "performance-engineer")
+```
+---
+## Best Practices
+### Prompt Engineering for Iteration
+**Include:**
+- Explicit completion criteria with verification commands
+- Phase-based breakdown for complex tasks
+- Safety limits (max iterations)
+- Progress tracking instructions
+- Stuck detection and recovery procedures
+**Example Well-Structured Prompt:**
+```markdown
+## Task: Implement Feature X
+### Success Criteria (ALL must pass):
+1. `npm test` exits with code 0
+2. `npm run lint` exits with code 0
+3. `npm run typecheck` exits with code 0
+4. No console.log statements in production code
+### Phases:
+1. Write failing tests
+2. Implement feature
+3. Fix any failures
+4. Clean up and refactor
+### Safety:
+- Max iterations: 20
+- After 10 iterations: summarize blockers
+- Stop if security issues detected
+### Completion:
+When ALL success criteria pass, output:
+<promise>FEATURE_X_COMPLETE</promise>
+```
+### When to Use Iterative Loops
+**Ideal for:**
+- Well-defined tasks with measurable success
+- Test-driven development
+- Bug fixing with reproducible tests
+- Coverage improvement
+- Performance optimization
+- Linting/formatting fixes
+**Not ideal for:**
+- Tasks requiring human judgment
+- Design decisions
+- Vague or subjective goals
+- One-time operations
+- Production debugging without tests
+---
+## Troubleshooting
+### Issue: Infinite Loop / No Progress
+**Symptoms**: Same errors repeat without improvement
+**Solutions**:
+1. Increase specificity in completion criteria
+2. Add "stuck detection" with alternative approaches
+3. Lower max iterations
+4. Break task into smaller phases
+### Issue: False Completion
+**Symptoms**: Loop ends but task not actually complete
+**Solutions**:
+1. Add more verification commands
+2. Make completion criteria more explicit
+3. Add integration tests alongside unit tests
+### Issue: Regression in Later Iterations
+**Symptoms**: Previously passing tests fail after new changes
+**Solutions**:
+1. Add regression check step
+2. Use git to compare iterations
+3. Implement smaller, targeted changes
+---
+## Related Skills
+- [tdd-london-chicago](../tdd-london-chicago/) - TDD approaches for iterative development
+- [qe-iterative-loop](../qe-iterative-loop/) - AQE v3 fleet-specific iteration patterns
+- [hooks-automation](../hooks-automation/) - Claude Flow hooks for automation
+## Resources
+- [Ralph Wiggum Technique](https://ghuntley.com/ralph/) - Original methodology
+- [Ralph Orchestrator](https://github.com/mikeyobrien/ralph-orchestrator) - Orchestration tools
+- [Claude Code Plugins](https://github.com/anthropics/claude-code) - Official plugins
+---
+**Origin**: Based on Ralph Wiggum plugin from claude-code repository (anthropics/claude-code)
+**Adapted for**: Agentic QE v3 with Claude Flow integration
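
The safety rules in the added skill (a max-iteration cap, plus stuck detection after several iterations without progress) can be sketched in TypeScript as below. This is an illustrative sketch, not code from the package: `runIterativeLoop`, `LoopOptions`, `LoopOutcome`, and the `attempt` callback are hypothetical names. It assumes "progress" means a drop in the number of failing checks, and it uses a synchronous callback for brevity where a real loop would be asynchronous.

```typescript
// Hypothetical sketch: none of these names come from agentic-qe.
interface LoopOptions {
  maxIterations: number; // hard cap, e.g. 20 in the skill's safety rules
  stuckAfter: number;    // iterations without progress before stopping, e.g. 5
}

type LoopOutcome =
  | { status: "complete"; iterations: number }
  | { status: "stuck"; iterations: number }
  | { status: "max-iterations"; iterations: number };

// `attempt` makes one change, runs verification, and returns the number of failing checks.
function runIterativeLoop(
  attempt: (iteration: number) => number,
  options: LoopOptions,
): LoopOutcome {
  let bestFailures = Number.POSITIVE_INFINITY;
  let iterationsWithoutProgress = 0;

  for (let i = 1; i <= options.maxIterations; i++) {
    const failures = attempt(i);
    if (failures === 0) {
      return { status: "complete", iterations: i };
    }
    if (failures < bestFailures) {
      // Progress: fewer failing checks than any earlier iteration.
      bestFailures = failures;
      iterationsWithoutProgress = 0;
    } else {
      iterationsWithoutProgress++;
      if (iterationsWithoutProgress >= options.stuckAfter) {
        return { status: "stuck", iterations: i };
      }
    }
  }
  return { status: "max-iterations", iterations: options.maxIterations };
}

// Simulated verification: failing checks shrink by one per iteration.
const improving = runIterativeLoop((i) => Math.max(0, 3 - i), { maxIterations: 20, stuckAfter: 5 });
// Simulated verification that never improves.
const flat = runIterativeLoop(() => 2, { maxIterations: 20, stuckAfter: 5 });

console.log(improving.status, improving.iterations); // complete 3
console.log(flat.status, flat.iterations);           // stuck 6
```

Returning a distinct `stuck` outcome, rather than simply running to the cap, gives the caller a point at which to document blockers and list attempted approaches, as the skill's stuck-detection rule asks.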

package/.claude/skills/skills-manifest.json CHANGED

@@ -1,9 +1,9 @@
 {
   "version": "1.3.0",
   "generated": "2026-02-04T00:00:00.000Z",
-  "totalSkills": 46,
-  "totalQESkills": 78,
-  "description": "Agentic QE Fleet skills manifest - 46 Tier 3 verified QE skills. Total 78 QE skills on disk (46 Tier 3 + 32 additional including QCSD swarms, n8n testing, enterprise integration, pentest validation, qe-* domains).",
+  "totalSkills": 48,
+  "totalQESkills": 80,
+  "description": "Agentic QE Fleet skills manifest - 48 Tier 3 verified QE skills. Total 80 QE skills on disk (48 Tier 3 + 32 additional including QCSD swarms, n8n testing, enterprise integration, pentest validation, qe-* domains).",
   "categories": {
     "qcsd-phases": {
       "description": "QCSD (Quality Conscious Software Delivery) phase swarms for shift-left quality",
@@ -33,6 +33,7 @@
       "skills": [
         "security-testing",
         "pentest-validation",
+        "validation-pipeline",
         "performance-testing",
         "accessibility-testing",
         "a11y-ally",
@@ -822,6 +823,33 @@
       "dependencies": [],
       "agents": ["qe-quality-analyzer"]
     },
+    "validation-pipeline": {
+      "id": "validation-pipeline",
+      "name": "Validation Pipeline",
+      "description": "Run structured 13-step requirements validation pipeline with gate enforcement, weighted scoring, and detailed findings. Supports format checking, vague term detection, testability analysis, and more.",
+      "category": "specialized-testing",
+      "priority": "high",
+      "file": "validation-pipeline/SKILL.md",
+      "tokenEstimate": 1500,
+      "optimizationStatus": "active",
+      "optimizationVersion": "1.0",
+      "lastOptimized": "2026-03-12",
+      "tags": ["validation", "requirements", "pipeline", "gate-enforcement", "scoring", "bmad"],
+      "dependencies": ["agentic-quality-engineering"],
+      "agents": ["qe-requirements-validator"],
+      "relatedSkills": ["verification-quality", "shift-left-testing", "context-driven-testing"],
+      "triggers": {
+        "patterns": [
+          "validate requirements",
+          "requirements validation",
+          "validation pipeline",
+          "run validation",
+          "check requirements quality"
+        ],
+        "autoInvoke": false,
+        "enforcementLevel": "standard"
+      }
+    },
     "bug-reporting-excellence": {
       "id": "bug-reporting-excellence",
       "name": "Bug Reporting Excellence",
@@ -904,7 +932,7 @@
   },
   "metadata": {
     "generatedBy": "Agentic QE Fleet",
-    "fleetVersion": "3.7.18",
+    "fleetVersion": "3.7.20",
     "manifestVersion": "1.3.0",
     "lastUpdated": "2026-02-04T00:00:00.000Z",
     "contributors": [
@@ -914,13 +942,13 @@
         "url": "https://github.com/fndlalit"
       }
     ],
-    "notes": "This manifest tracks 46 Tier 3 verified QE skills. Total 78 QE skills on disk (46 Tier 3 + 32 additional). Claude Flow platform skills (33) are managed separately.",
+    "notes": "This manifest tracks 48 Tier 3 verified QE skills. Total 80 QE skills on disk (48 Tier 3 + 32 additional). Claude Flow platform skills (33) are managed separately.",
     "skillBreakdown": {
-      "qeSkillsTier3": 46,
+      "qeSkillsTier3": 48,
       "qeSkillsAdditional": 32,
-      "totalQESkills": 78,
+      "totalQESkills": 80,
       "platformSkills": 36,
-      "totalOnDisk": 111
+      "totalOnDisk": 113
     },
     "excludedCategories": {
       "reason": "Platform skills are available but not tracked in this QE-focused manifest",
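
The counters in the `skillBreakdown` hunk can be cross-checked with simple arithmetic. The sketch below copies the 3.7.20 values from the hunk and assumes, without confirmation from the manifest, that `totalOnDisk` is meant to equal QE skills plus platform skills. The variable names are illustrative only.

```typescript
// Values copied from the 3.7.20 side of the skillBreakdown hunk.
const skillBreakdown = {
  qeSkillsTier3: 48,
  qeSkillsAdditional: 32,
  totalQESkills: 80,
  platformSkills: 36,
  totalOnDisk: 113,
};
// Platform skill count quoted in the manifest's "notes" string.
const platformSkillsInNotes = 33;

const qeSum = skillBreakdown.qeSkillsTier3 + skillBreakdown.qeSkillsAdditional;
// Assumption: totalOnDisk = totalQESkills + platform skills.
const diskSumFromField = skillBreakdown.totalQESkills + skillBreakdown.platformSkills;
const diskSumFromNotes = skillBreakdown.totalQESkills + platformSkillsInNotes;

console.log(qeSum === skillBreakdown.totalQESkills);          // true  (48 + 32 = 80)
console.log(diskSumFromField === skillBreakdown.totalOnDisk); // false (80 + 36 = 116)
console.log(diskSumFromNotes === skillBreakdown.totalOnDisk); // true  (80 + 33 = 113)
```

Under that assumption, `totalOnDisk` agrees with the platform count of 33 given in `notes` but not with the unchanged `platformSkills: 36` field. The same holds for the 3.7.18 values (78 + 33 = 111).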

package/.claude/skills/validation-pipeline/SKILL.md ADDED

@@ -0,0 +1,164 @@
+---
+name: "Validation Pipeline"
+description: "Structured step-by-step validation of requirements, code, and artifacts with gate enforcement, per-step scoring, and structured reports."
+trust_tier: 3
+validation:
+  schema_path: schemas/output.json
+  validator_path: scripts/validate-config.json
+  eval_path: evals/validation-pipeline.yaml
+---
+# Validation Pipeline
+## Purpose
+Run structured validation pipelines that execute steps sequentially, enforce gates at blocking failures, and produce scored reports. Uses the `src/validation/pipeline.ts` framework with 13 requirements validation steps (BMAD-003).
+## Activation
+- When validating requirements documents
+- When running structured quality gates
+- When assessing document completeness, testability, or traceability
+- When invoked via `/validation-pipeline`
+## Quick Start
+```bash
+# Validate a requirements document (all 13 steps)
+/validation-pipeline requirements docs/requirements.md
+# Validate with specific steps only
+/validation-pipeline requirements docs/requirements.md --steps format-check,completeness-check,invest-criteria
+# Continue past blocking failures
+/validation-pipeline requirements docs/requirements.md --continue-on-failure
+# Output as JSON
+/validation-pipeline requirements docs/requirements.md --json
+```
+## Workflow
+### Step 1: Read the Target Document
+Read the file specified by the user. If no file is provided, ask for one.
+```
+Read the target document using the Read tool.
+Store the content for pipeline execution.
+```
+### Step 2: Select Pipeline
+Choose the appropriate pipeline based on the user's request:
+| Pipeline | Steps | Use Case |
+|----------|-------|----------|
+| `requirements` | 13 | Requirements documents, PRDs, user stories |
+Additional pipelines can be created by defining new step sets in `src/validation/steps/`.
+### Step 3: Execute Pipeline
+The pipeline framework (`src/validation/pipeline.ts`) handles execution:
+1. **Sequential execution** — steps run in order, each receiving results from prior steps
+2. **Gate enforcement** — blocking steps that fail halt the pipeline (unless `--continue-on-failure`)
+3. **Per-step scoring** — each step produces a 0-100 score with findings and evidence
+4. **Weighted rollup** — overall score uses category weights (format=10%, content=30%, quality=25%, traceability=20%, compliance=15%)
+#### Requirements Pipeline Steps (13 total)
+| # | Step ID | Category | Severity | What It Checks |
+|---|---------|----------|----------|----------------|
+| 1 | `format-check` | format | blocking | Headings, required sections, document length |
+| 2 | `completeness-check` | content | blocking | Required fields populated, acceptance criteria present |
+| 3 | `invest-criteria` | quality | warning | Independent, Negotiable, Valuable, Estimable, Small, Testable |
+| 4 | `smart-acceptance` | quality | warning | Specific, Measurable, Achievable, Relevant, Time-bound |
+| 5 | `testability-score` | quality | warning | Can each requirement be tested? |
+| 6 | `vague-term-detection` | content | info | Flags "should", "might", "various", "etc." |
+| 7 | `information-density` | content | info | Every sentence carries weight, no filler |
+| 8 | `traceability-check` | traceability | warning | Requirements-to-tests mapping exists |
+| 9 | `implementation-leakage` | quality | warning | Requirements don't prescribe implementation |
+| 10 | `domain-compliance` | compliance | info | Alignment with domain model |
+| 11 | `dependency-analysis` | traceability | info | Cross-requirement dependencies identified |
+| 12 | `bdd-scenario-generation` | quality | warning | Can generate Given/When/Then for each requirement |
+| 13 | `holistic-quality` | compliance | blocking | Overall coherence, no contradictions |
+### Step 4: Report Results
+Format the pipeline result as a structured report:
+```markdown
+# Validation Report: Requirements Pipeline
+**Overall**: PASS/FAIL/WARN | **Score**: 85/100 | **Duration**: 42ms
+## Step Results
+| # | Step | Status | Score | Findings | Duration |
+|---|------|--------|-------|----------|----------|
+| 1 | Format Check | PASS | 100 | 0 | 2ms |
+| 2 | Completeness | WARN | 60 | 2 | 5ms |
+...
+## Blockers
+- (blocking findings listed here)
+## All Findings
+- [HIGH] Missing acceptance criteria: Requirement US-104 has no AC
+- [MEDIUM] Vague term: "should" used 5 times without specifics
+...
+```
+### Step 5: Record Learning
+After pipeline execution, record the outcome for learning:
+```typescript
+// Store validation pattern
+memory store --namespace validation-pipeline --key "req-validation-{timestamp}" --value "{score, findings_count, halted}"
+```
+## Parameters
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `pipeline` | string | `requirements` | Pipeline type to run |
+| `file` | string | required | Path to document to validate |
+| `--steps` | string[] | all | Specific step IDs to run |
+| `--continue-on-failure` | boolean | false | Skip blocking gates |
+| `--json` | boolean | false | Output as JSON instead of markdown |
+| `--metadata` | object | {} | Additional context for steps |
+## Integration Points
+- **qe-requirements-validator agent** — delegates structured validation to this pipeline
+- **qe-quality-gate agent** — uses pipeline for gate evaluation
+- **YAML Pipelines** — can invoke validation steps as workflow actions
+- **MCP** — accessible via `pipeline_validate` tool
+## Output Schema
+The pipeline produces a `PipelineResult` object (see `schemas/output.json`):
+```typescript
+{
+  pipelineId: string;
+  pipelineName: string;
+  overall: 'pass' | 'fail' | 'warn';
+  score: number;           // 0-100 weighted average
+  steps: StepResult[];     // per-step details
+  blockers: Finding[];     // blocking findings
+  halted: boolean;
+  haltedAt?: string;       // step ID where halted
+  totalDuration: number;
+  timestamp: string;
+}
+```
+## Error Handling
+- **Step throws exception** — captured as a FAIL with critical finding, pipeline continues or halts per severity
+- **File not found** — report error, do not run pipeline
+- **Empty document** — format-check step will catch this as a blocking failure
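
The execution model described in Step 3 of the added skill (sequential steps, gate enforcement on blocking failures, and a category-weighted rollup) can be sketched as follows. This is a minimal sketch and not the API of `src/validation/pipeline.ts`, which is not shown in this diff. The `Step` type, `runPipeline`, and `PASS_THRESHOLD` are hypothetical, the pass threshold of 50 is an assumption, and renormalizing weights over the categories that actually ran is one possible design rather than the package's documented behavior. Only the category weights are taken from the skill text.

```typescript
// Hypothetical sketch: illustrative types and functions, not the package's implementation.
type Category = "format" | "content" | "quality" | "traceability" | "compliance";
type Severity = "blocking" | "warning" | "info";

interface Step {
  id: string;
  category: Category;
  severity: Severity;
  run: (doc: string) => number; // returns a 0-100 score
}

// Category weights as listed in the skill's "Weighted rollup" item.
const WEIGHTS: Record<Category, number> = {
  format: 0.10, content: 0.30, quality: 0.25, traceability: 0.20, compliance: 0.15,
};

const PASS_THRESHOLD = 50; // assumption: the source does not define the failing score

function runPipeline(steps: Step[], doc: string, continueOnFailure = false) {
  const results: { id: string; category: Category; score: number }[] = [];
  let haltedAt: string | undefined;

  for (const step of steps) {
    const score = step.run(doc);
    results.push({ id: step.id, category: step.category, score });
    // Gate enforcement: a failed blocking step stops the run unless overridden.
    if (step.severity === "blocking" && score < PASS_THRESHOLD && !continueOnFailure) {
      haltedAt = step.id;
      break;
    }
  }

  // Weighted rollup: average per category, then weight only the categories that ran.
  let weighted = 0;
  let weightUsed = 0;
  for (const category of Object.keys(WEIGHTS) as Category[]) {
    const scores = results.filter((r) => r.category === category).map((r) => r.score);
    if (scores.length === 0) continue;
    const avg = scores.reduce((a, b) => a + b, 0) / scores.length;
    weighted += avg * WEIGHTS[category];
    weightUsed += WEIGHTS[category];
  }
  const score = weightUsed > 0 ? Math.round(weighted / weightUsed) : 0;
  return { score, halted: haltedAt !== undefined, haltedAt, steps: results };
}

// Toy steps that reuse three step IDs from the requirements pipeline table.
const steps: Step[] = [
  { id: "format-check", category: "format", severity: "blocking", run: (d) => (d.includes("#") ? 100 : 0) },
  { id: "completeness-check", category: "content", severity: "blocking", run: (d) => (d.includes("Acceptance") ? 100 : 60) },
  { id: "testability-score", category: "quality", severity: "warning", run: () => 80 },
];

const ok = runPipeline(steps, "# Story\nAcceptance criteria: ...");
const halted = runPipeline(steps, "plain text");

console.log(ok.score, ok.halted);           // 92 false
console.log(halted.score, halted.haltedAt); // 0 format-check
```

Passing `true` as the third argument mirrors the `--continue-on-failure` flag: the failed blocking step is still recorded, but later steps run and contribute to the score.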