npm - agentic-qe - Versions diffs - 3.8.2 → 3.8.3 - Mend

agentic-qe 3.8.2 → 3.8.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (217)

package/.claude/skills/README.md CHANGED

@@ -4,9 +4,9 @@ This directory contains Quality Engineering skills managed by Agentic QE.
 ## Summary
-- **Total QE Skills**: 77
+- **Total QE Skills**: 84
 - **V2 Methodology Skills**: 62
-- **V3 Domain Skills**: 15
+- **V3 Domain Skills**: 23
 - **Platform Skills**: 30 (Claude Flow managed)
 - **Validation Infrastructure**: ✅ Installed
@@ -80,26 +80,48 @@ Version-agnostic quality engineering best practices from the QE community.
 - **wms-testing-patterns**: Warehouse Management System testing patterns for inventory operations, pick/pack/ship workflows, wave management, EDI X12/EDIFACT compliance, RF/barcode scanning, and WMS-ERP integration. Use when testing WMS platforms (Blue Yonder, Manhattan, SAP EWM).
 - **xp-practices**: Apply XP practices including pair programming, ensemble programming, continuous integration, and sustainable pace. Use when implementing agile development practices, improving team collaboration, or adopting technical excellence practices.
-## V3 Domain Skills (15)
+## V3 Domain Skills (23)
-V3-specific implementation guides for the 12 DDD bounded contexts.
+V3-specific implementation guides for the 12 DDD bounded contexts plus on-demand hooks, investigation runbooks, and measurement tools.
-- **aqe-v2-v3-migration**: Migrate Agentic QE projects from v2 to v3 with zero data loss
 - **pentest-validation**: Orchestrate security finding validation through graduated exploitation. 4-phase pipeline: recon (SAST/DAST), analysis (code review), validation (exploit proof), report (No Exploit, No Report gate). Eliminates false positives by proving exploitability.
 - **qe-chaos-resilience**: Chaos engineering and resilience testing including fault injection, load testing, and system recovery validation.
 - **qe-code-intelligence**: Knowledge graph-based code understanding with semantic search and 80% token reduction through intelligent context retrieval.
-- **qe-contract-testing**: Consumer-driven contract testing for APIs including REST, GraphQL, and event-driven systems with schema validation.
 - **qe-coverage-analysis**: O(log n) sublinear coverage gap detection with risk-weighted analysis and intelligent test prioritization.
 - **qe-defect-intelligence**: AI-powered defect prediction, pattern learning, and root cause analysis for proactive quality management.
 - **qe-iterative-loop**: Quality Engineering iteration loops for autonomous test improvement, coverage achievement, and quality gate compliance. Use when tests need to pass, coverage targets must be met, quality gates require compliance, or flaky tests need stabilization. Integrates with AQE v3 fleet agents for coordinated quality iteration.
 - **qe-learning-optimization**: Transfer learning, metrics optimization, and continuous improvement for AI-powered QE agents.
 - **qe-quality-assessment**: Comprehensive quality gates, metrics analysis, and deployment readiness assessment for continuous quality assurance.
 - **qe-requirements-validation**: Requirements traceability, acceptance criteria validation, and BDD scenario management for complete requirements coverage.
-- **qe-security-compliance**: Security auditing, vulnerability scanning, and compliance validation for OWASP, SOC2, GDPR, and other standards.
 - **qe-test-execution**: Parallel test execution orchestration with intelligent scheduling, retry logic, and comprehensive result aggregation.
 - **qe-test-generation**: AI-powered test generation using pattern recognition, code analysis, and intelligent test synthesis for comprehensive test coverage.
 - **qe-visual-accessibility**: Visual regression testing, responsive design validation, and WCAG accessibility compliance testing.
+### On-Demand Hooks
+- **strict-tdd**: Enforces strict TDD red-green-refactor discipline. Blocks code changes that lack a failing test first.
+- **no-skip**: Prevents use of .skip, .only, or xdescribe in test files. Ensures all tests are active.
+- **coverage-guard**: Blocks merges when coverage drops below configured thresholds. Enforces coverage-only-goes-up policy.
+- **freeze-tests**: Locks the test suite during stabilization periods. Prevents new test additions or modifications.
+- **security-watch**: Scans code changes for security anti-patterns (hardcoded secrets, SQL injection, etc.) on every commit.
+### Runbook Skills
+- **test-failure-investigator**: Automated runbook for investigating test failures. Gathers logs, diffs, recent changes, and flaky-test history to diagnose root cause.
+- **coverage-drop-investigator**: Automated runbook for diagnosing coverage drops. Identifies uncovered lines, maps to recent commits, and suggests targeted tests.
+### Product Verification
+- **e2e-flow-verifier**: End-to-end user flow verification against acceptance criteria. Validates critical paths through the application match expected behavior.
+### Data & Analysis
+- **test-metrics-dashboard**: Aggregates test results, coverage trends, flaky-test rates, and execution times into a unified dashboard view.
+### Measurement
+- **skill-stats**: Measures skill usage frequency, success rates, and token costs. Provides data for skill portfolio optimization.
 ## Platform Skills (30)
 Claude Flow platform skills (managed separately).
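
The `no-skip` entry above says the hook prevents `.skip`, `.only`, and `xdescribe` in test files, but the diff does not show how. A minimal sketch of such a check, assuming a line-by-line regex scan is acceptable; `findSkipMarkers` and `SkipViolation` are hypothetical names and not part of agentic-qe:

```typescript
// Hypothetical sketch of a no-skip check; not the hook's actual implementation.

interface SkipViolation {
  line: number;   // 1-based line number in the scanned source
  marker: string; // the matched marker, e.g. "it.skip"
}

// Matches describe/it/test followed by .skip or .only, plus xdescribe.
const FORBIDDEN = /\b(?:describe|it|test)\.(?:skip|only)\b|\bxdescribe\b/g;

function findSkipMarkers(source: string): SkipViolation[] {
  const violations: SkipViolation[] = [];
  source.split("\n").forEach((text, index) => {
    // With the g flag, match() returns every matched marker on the line (or null).
    for (const marker of text.match(FORBIDDEN) ?? []) {
      violations.push({ line: index + 1, marker });
    }
  });
  return violations;
}
```

A plain text scan like this also flags markers that appear inside comments or string literals, so a real hook would probably need an AST-based check to avoid false positives.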

package/.claude/skills/TRUST-TIERS.md CHANGED

@@ -5,30 +5,30 @@
 ## Overview
-![Tier 3 (Verified)](https://img.shields.io/badge/Tier%203%20(Verified)-46-brightgreen)
+![Tier 3 (Verified)](https://img.shields.io/badge/Tier%203%20(Verified)-44-brightgreen)
 ![Tier 2 (Validated)](https://img.shields.io/badge/Tier%202%20(Validated)-7-green)
 ![Tier 1 (Structured)](https://img.shields.io/badge/Tier%201%20(Structured)-5-yellow)
-![Tier 0 (Advisory)](https://img.shields.io/badge/Tier%200%20(Advisory)-39-lightgrey)
+![Tier 0 (Advisory)](https://img.shields.io/badge/Tier%200%20(Advisory)-49-lightgrey)
-**Total Skills**: 97
+**Total Skills**: 105
 ## Trust Tier Distribution
 | Tier | Count | Description |
 |------|-------|-------------|
-| 3 - Verified | 46 | Full evaluation test suite |
+| 3 - Verified | 44 | Full evaluation test suite |
 | 2 - Validated | 7 | Has executable validator |
 | 1 - Structured | 5 | Has JSON output schema |
-| 0 - Advisory | 39 | SKILL.md only |
+| 0 - Advisory | 49 | SKILL.md only |
 ## Validation Status
 | Status | Count |
 |--------|-------|
-| Passing | 46 |
+| Passing | 44 |
 | Failing | 0 |
 | Unknown | 12 |
-| Skipped | 39 |
+| Skipped | 49 |
 ---
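
The 3.8.3 counts in the two tables above can be cross-checked against the declared total. A small sketch, with the numbers copied from the added and unchanged lines of this hunk; the variable and function names are illustrative only:

```typescript
// Counts taken from the 3.8.3 side of the TRUST-TIERS.md hunk.
const tierCounts: Record<string, number> = {
  verified: 44,   // Tier 3
  validated: 7,   // Tier 2
  structured: 5,  // Tier 1
  advisory: 49,   // Tier 0
};

const statusCounts: Record<string, number> = {
  passing: 44,
  failing: 0,
  unknown: 12,
  skipped: 49,
};

const declaredTotal = 105; // "Total Skills" line in the same hunk

// Adds up every value in a name -> count map.
function sumCounts(counts: Record<string, number>): number {
  return Object.keys(counts).reduce((total, key) => total + counts[key], 0);
}

const tiersConsistent = sumCounts(tierCounts) === declaredTotal;
const statusesConsistent = sumCounts(statusCounts) === declaredTotal;
console.log({ tiersConsistent, statusesConsistent });
```

Both sums come to 105: 44 + 7 + 5 + 49 for the tiers and 44 + 0 + 12 + 49 for the validation statuses, where the 12 "Unknown" skills correspond to Tier 2 plus Tier 1 (7 + 5).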
@@ -60,13 +60,11 @@ These skills have complete validation infrastructure: JSON schema, validator scr
 | qcsd-ideation-swarm | qcsd-phases | `schemas/output.json` | `scripts/validate-config.json` | `evals/qcsd-ideation-swarm.yaml` | Passing |
 | qe-chaos-resilience | - | `schemas/output.json` | `scripts/validate-config.json` | `evals/qe-chaos-resilience.yaml` | Passing |
 | qe-code-intelligence | - | `schemas/output.json` | `scripts/validate-config.json` | `evals/qe-code-intelligence.yaml` | Passing |
-| qe-contract-testing | - | `schemas/output.json` | `scripts/validate-config.json` | `evals/qe-contract-testing.yaml` | Passing |
 | qe-coverage-analysis | - | `schemas/output.json` | `scripts/validate-config.json` | `evals/qe-coverage-analysis.yaml` | Passing |
 | qe-defect-intelligence | - | `schemas/output.json` | `scripts/validate-config.json` | `evals/qe-defect-intelligence.yaml` | Passing |
 | qe-learning-optimization | - | `schemas/output.json` | `scripts/validate-config.json` | `evals/qe-learning-optimization.yaml` | Passing |
 | qe-quality-assessment | - | `schemas/output.json` | `scripts/validate-config.json` | `evals/qe-quality-assessment.yaml` | Passing |
 | qe-requirements-validation | - | `schemas/output.json` | `scripts/validate-config.json` | `evals/qe-requirements-validation.yaml` | Passing |
-| qe-security-compliance | - | `schemas/output.json` | `scripts/validate-config.json` | `evals/qe-security-compliance.yaml` | Passing |
 | qe-test-execution | - | `schemas/output.json` | `scripts/validate-config.json` | `evals/qe-test-execution.yaml` | Passing |
 | qe-test-generation | - | `schemas/output.json` | `scripts/validate-config.json` | `evals/qe-test-generation.yaml` | Passing |
 | qe-visual-accessibility | - | `schemas/output.json` | `scripts/validate-config.json` | `evals/qe-visual-accessibility.yaml` | Passing |
@@ -110,13 +108,31 @@ These skills have a JSON output schema but no validator yet.
 | Skill | Category | Schema |
 |-------|----------|--------|
 | agentic-quality-engineering | qe-core | `schemas/output.json` |
-| aqe-v2-v3-migration | - | `schemas/output.json` |
 | consultancy-practices | professional-practice | `schemas/output.json` |
 | technical-writing | communication | `schemas/output.json` |
 | test-environment-management | specialized-testing | `schemas/output.json` |
 ---
+## Tier 0 Skills (Advisory)
+These skills have SKILL.md only, with no evaluation infrastructure yet.
+| Skill | Category |
+|-------|----------|
+| coverage-drop-investigator | investigation |
+| coverage-guard | on-demand-hooks |
+| e2e-flow-verifier | product-verification |
+| freeze-tests | on-demand-hooks |
+| no-skip | on-demand-hooks |
+| security-watch | on-demand-hooks |
+| skill-stats | measurement |
+| strict-tdd | on-demand-hooks |
+| test-failure-investigator | investigation |
+| test-metrics-dashboard | analytics |
+---
 ## Upgrading Skills
 To upgrade a skill to a higher trust tier:

package/.claude/skills/a11y-ally/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: a11y-ally
-description: "Comprehensive WCAG accessibility auditing with multi-tool testing (axe-core + pa11y + Lighthouse), TRUE PARALLEL execution with Promise.allSettled, graceful degradation, retry with backoff, context-aware remediation, learning integration, and video accessibility. Uses 3-tier browser cascade: Vibium → agent-browser → Playwright+Stealth."
+description: "Use when running comprehensive WCAG accessibility audits with axe-core + pa11y + Lighthouse, generating context-aware remediation, or testing video accessibility. Supports 3-tier browser cascade with graceful degradation."
 category: specialized-testing
 priority: critical
 tokenEstimate: 10000
@@ -1661,3 +1661,12 @@ ROI = (Impact × Users%) / Effort_Hours
 11. **NEVER** skip video pipeline if videos detected
 12. **NEVER** complete without remediation.md
 13. **NEVER** fail audit just because 1-2 tools failed (use graceful degradation)
+## Gotchas
+- axe-core catches ~30% of WCAG issues — automated tools miss keyboard navigation, reading order, and cognitive issues
+- Agent runs Lighthouse only and reports "accessible" — Lighthouse alone is insufficient, always run axe-core + pa11y too
+- Screen reader testing requires actual screen reader interaction, not just ARIA attribute checks
+- Video accessibility (captions, audio descriptions) is frequently skipped — check every `<video>` element
+- Color contrast tools disagree on gradients and transparency — test with actual low-vision simulation
+- Playwright+Stealth may be blocked by some sites — fall back gracefully, don't skip the audit
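
The rule about not failing an audit when one or two tools fail, together with the `Promise.allSettled` wording in the 3.8.2 description, suggests a pattern along these lines. This is a hedged sketch only: `AuditTool`, `AuditSummary`, and `runAudit` are hypothetical names, and the tool functions stand in for whatever wrappers the skill actually uses around axe-core, pa11y, and Lighthouse.

```typescript
// Hypothetical sketch of graceful degradation across audit tools.
// Each tool resolves to a list of issue descriptions or rejects on failure.
type AuditTool = () => Promise<string[]>;

interface AuditSummary {
  issues: string[];      // issues collected from every tool that succeeded
  failedTools: string[]; // names of tools that rejected
  degraded: boolean;     // true when at least one tool failed
}

async function runAudit(tools: Record<string, AuditTool>): Promise<AuditSummary> {
  const names = Object.keys(tools);
  // allSettled waits for every tool and never rejects, so one failing
  // tool cannot hide the results of the others.
  const settled = await Promise.allSettled(names.map((name) => tools[name]()));

  const issues: string[] = [];
  const failedTools: string[] = [];
  settled.forEach((result, index) => {
    if (result.status === "fulfilled") {
      issues.push(...result.value);
    } else {
      failedTools.push(names[index]);
    }
  });

  // Only give up when nothing ran successfully.
  if (names.length > 0 && failedTools.length === names.length) {
    throw new Error("all audit tools failed");
  }
  return { issues, failedTools, degraded: failedTools.length > 0 };
}
```

Reporting `failedTools` alongside the issues keeps a degraded run visible, so a partial audit is not mistaken for a clean one.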

package/.claude/skills/accessibility-testing/SKILL.md CHANGED

@@ -21,6 +21,8 @@ validation:
 # Accessibility Testing
+> **Consolidated**: For comprehensive WCAG auditing with multi-tool testing (axe-core + pa11y + Lighthouse), video accessibility, and remediation, prefer [`/a11y-ally`](../a11y-ally/). This skill provides a quick reference card for basic accessibility testing patterns.
 <default_to_action>
 When testing accessibility or ensuring compliance:
 1. APPLY POUR principles: Perceivable, Operable, Understandable, Robust

package/.claude/skills/agentic-quality-engineering/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: agentic-quality-engineering
-description: "AI agents as force multipliers for quality work. Core skill for all 19 QE agents using PACT principles."
+description: "Use when orchestrating QE agents, understanding PACT principles, configuring the AQE v3 fleet, or leveraging AI agents as force multipliers for quality work."
 category: qe-core
 priority: critical
 tokenEstimate: 1400

package/.claude/skills/api-testing-patterns/SKILL.md CHANGED

@@ -297,3 +297,11 @@ await apiFleet.execute({
 API testing = verifying contracts and behavior, not implementation. Focus on what matters to consumers: correct responses, proper error handling, acceptable performance.
 **With Agents:** Agents automate contract validation, generate comprehensive test suites from specs, and monitor production APIs for drift. Use agents to maintain API quality at scale.
+## Gotchas
+- Agent generates tests against documented API, not actual API — always validate against running service first
+- Auth tokens expire between test runs — use fixtures with long-lived tokens or refresh before each suite
+- Rate limiting in CI causes intermittent failures — add retry with exponential backoff for 429 responses
+- GraphQL introspection may be disabled in production — test against staging schema, not production endpoint
+- Idempotency tests need unique request IDs per run — hardcoded IDs cause false passes on retry
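
The rate-limiting gotcha above recommends retrying 429 responses with exponential backoff. One way that could look, as a sketch under stated assumptions: `sendWithBackoff` and `HttpResult` are hypothetical names, the delay doubles from a 1000 ms base, and the `sleep` parameter is injectable so tests do not have to wait in real time.

```typescript
// Hypothetical sketch: retry a request while the service answers 429.
interface HttpResult {
  status: number;
}

async function sendWithBackoff<T extends HttpResult>(
  send: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 1000,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let response = await send();
  for (let attempt = 0; attempt < maxRetries && response.status === 429; attempt++) {
    // Wait baseDelayMs, then 2x, then 4x, and so on, before each retry.
    await sleep(baseDelayMs * 2 ** attempt);
    response = await send();
  }
  // Either a non-429 response, or the last 429 once retries are exhausted.
  return response;
}
```

With Supertest this could be called as `sendWithBackoff(() => request(app).get('/api/items'))`, where the route is a placeholder. Honouring a `Retry-After` header, when the service sends one, would be a natural refinement.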

package/.claude/skills/api-testing-patterns/config.json ADDED

@@ -0,0 +1,14 @@
+{
+  "$schema": "./config-schema.json",
+  "_description": "API Testing configuration. Auto-created on first run. Edit to customize.",
+  "api_type": null,
+  "auth_type": null,
+  "base_url": null,
+  "options": {
+    "validateSchemaOnEveryRequest": true,
+    "retryOn429": true,
+    "retryDelay": 1000,
+    "timeout": 30000
+  },
+  "_setupPrompt": "If api_type is null, ask: 'What type of API are you testing? (rest/graphql/grpc)'. If auth_type is null, ask: 'What authentication does the API use? (bearer/oauth2/api-key/basic/none)'. If base_url is null, ask: 'What is the base URL for the API under test?'"
+}
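
The `_setupPrompt` field describes a simple rule: ask one question for each top-level setting that is still `null`. A sketch of that rule in code, assuming the three fields are plain strings once set; `ApiTestingConfig` and `pendingSetupQuestions` are hypothetical names, and the question text is copied from the config above:

```typescript
// Hypothetical sketch of the null-check logic described in _setupPrompt.
interface ApiTestingConfig {
  api_type: string | null;
  auth_type: string | null;
  base_url: string | null;
}

function pendingSetupQuestions(config: ApiTestingConfig): string[] {
  const questions: string[] = [];
  if (config.api_type === null) {
    questions.push("What type of API are you testing? (rest/graphql/grpc)");
  }
  if (config.auth_type === null) {
    questions.push("What authentication does the API use? (bearer/oauth2/api-key/basic/none)");
  }
  if (config.base_url === null) {
    questions.push("What is the base URL for the API under test?");
  }
  return questions;
}
```

A freshly created config returns all three questions; once every field is filled in, the list is empty and setup can be skipped.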

package/.claude/skills/api-testing-patterns/templates/api-test-scaffold.md ADDED

@@ -0,0 +1,87 @@
+# API Test Scaffold Template
+## REST API Test Structure (Jest/Supertest)
+```typescript
+import request from 'supertest';
+import { app } from '../src/app';
+describe('{{Resource}} API', () => {
+  // Setup
+  let authToken: string;
+  beforeAll(async () => {
+    authToken = await getTestToken();
+  });
+  describe('GET /api/{{resource}}', () => {
+    it('returns paginated list with valid auth', async () => {
+      const res = await request(app)
+        .get('/api/{{resource}}')
+        .set('Authorization', `Bearer ${authToken}`)
+        .query({ page: 1, limit: 10 });
+      expect(res.status).toBe(200);
+      expect(res.body.data).toBeInstanceOf(Array);
+      expect(res.body.pagination).toMatchObject({
+        page: 1,
+        limit: 10,
+        total: expect.any(Number)
+      });
+    });
+    it('returns 401 without auth', async () => {
+      const res = await request(app).get('/api/{{resource}}');
+      expect(res.status).toBe(401);
+    });
+    it('returns 400 for invalid query params', async () => {
+      const res = await request(app)
+        .get('/api/{{resource}}')
+        .set('Authorization', `Bearer ${authToken}`)
+        .query({ page: -1 });
+      expect(res.status).toBe(400);
+    });
+  });
+  describe('POST /api/{{resource}}', () => {
+    it('creates resource with valid payload', async () => {
+      const payload = { /* valid fields */ };
+      const res = await request(app)
+        .post('/api/{{resource}}')
+        .set('Authorization', `Bearer ${authToken}`)
+        .send(payload);
+      expect(res.status).toBe(201);
+      expect(res.body.id).toBeDefined();
+    });
+    it('returns 422 for invalid payload', async () => {
+      const res = await request(app)
+        .post('/api/{{resource}}')
+        .set('Authorization', `Bearer ${authToken}`)
+        .send({});
+      expect(res.status).toBe(422);
+      expect(res.body.errors).toBeDefined();
+    });
+    it('is idempotent with same request ID', async () => {
+      const requestId = crypto.randomUUID();
+      const payload = { /* valid fields */ };
+      const res1 = await request(app)
+        .post('/api/{{resource}}')
+        .set('Authorization', `Bearer ${authToken}`)
+        .set('X-Request-ID', requestId)
+        .send(payload);
+      const res2 = await request(app)
+        .post('/api/{{resource}}')
+        .set('Authorization', `Bearer ${authToken}`)
+        .set('X-Request-ID', requestId)
+        .send(payload);
+      expect(res1.body.id).toBe(res2.body.id);
+    });
+  });
+});
+```

package/.claude/skills/bug-reporting-excellence/SKILL.md CHANGED

@@ -227,3 +227,17 @@ const bugFleet = await FleetManager.coordinate({
 Your bug report is the starting point for someone else's work. Make it **complete** (all info needed), **clear** (anyone can follow), **concise** (no noise), and **actionable** (developer knows next step).
 **Good bug reports = Faster fixes = Better product = Happier users**
+## Skill Composition
+- **After finding bug** → Start with `/test-failure-investigator` for root cause
+- **Prevent regression** → Use `/regression-testing` to add test preventing recurrence
+- **Track quality** → Feed into `/test-metrics-dashboard` for trend analysis
+## Gotchas
+- Agent omits environment details (OS, browser version, node version) — these are critical for reproduction
+- "Steps to reproduce" that start with "1. Open the app" are useless — specify exact URL, user role, and state
+- Agent combines multiple bugs into one report — enforce ONE BUG = ONE REPORT strictly
+- Screenshots without annotations don't help — always highlight the actual error area
+- Severity assessment is often wrong — agent marks everything as "critical" or everything as "low"

package/.claude/skills/code-review-quality/SKILL.md CHANGED

@@ -232,3 +232,17 @@ const reviewFleet = await FleetManager.coordinate({
 **Prioritize feedback:** 🔴 Blocker → 🟡 Major → 🟢 Minor → 💡 Suggestion. Focus on bugs and security, not style. Ask questions, don't command. Review < 400 lines at a time. Fast feedback (< 24h) beats thorough feedback.
 **With Agents:** Agents automate security, performance, and coverage checks, freeing human reviewers to focus on logic and design. Use agents for consistent, fast initial review.
+## Skill Composition
+- **Security concerns** → Compose with `/security-testing` for security-focused review
+- **Coverage check** → Run `/qe-coverage-analysis` on changed files
+- **Ship decision** → Feed review results into `/qe-quality-assessment`
+## Gotchas
+- Agent reviews >400 lines at once and misses issues — chunk reviews to 200-400 lines maximum
+- Nitpicking style while missing logic bugs is the #1 agent review failure — prioritize correctness over formatting
+- Agent approves code that compiles but has subtle race conditions — always check shared state and async patterns
+- Review comments without suggested fixes are unhelpful — always include a proposed alternative
+- Agent doesn't check if the PR actually solves the linked issue — verify the stated problem is actually fixed
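
The first gotcha above asks for reviews to be chunked to at most 400 lines. A minimal sketch of that splitting step, assuming the diff is already available as an array of lines; `chunkForReview` is a hypothetical helper, not an agentic-qe API:

```typescript
// Hypothetical sketch: split a diff into review-sized chunks.
function chunkForReview(lines: string[], maxLines = 400): string[][] {
  if (maxLines < 1 || Math.floor(maxLines) !== maxLines) {
    throw new RangeError("maxLines must be a positive integer");
  }
  const chunks: string[][] = [];
  for (let start = 0; start < lines.length; start += maxLines) {
    // slice() clamps the end index, so the last chunk may be shorter.
    chunks.push(lines.slice(start, start + maxLines));
  }
  return chunks;
}
```

Splitting on a fixed line count can cut a function in half; a real reviewer would likely prefer file or hunk boundaries and fall back to a fixed size only for very large files.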

package/.claude/skills/compatibility-testing/SKILL.md CHANGED

@@ -47,57 +47,6 @@ When validating cross-browser/platform compatibility:
 - Launching in new markets
 - Responsive design validation
-### Browser Matrix
-| Browser | Versions | Priority |
-|---------|----------|----------|
-| **Chrome** | Latest, N-1 | High |
-| **Firefox** | Latest, N-1 | High |
-| **Safari** | Latest, N-1 | High |
-| **Edge** | Latest | Medium |
-| **Mobile Safari** | iOS latest | High |
-| **Mobile Chrome** | Android latest | High |
-### Screen Breakpoints
-| Category | Width Range |
-|----------|-------------|
-| **Mobile** | 320px - 480px |
-| **Tablet** | 481px - 768px |
-| **Desktop** | 769px - 1920px+ |
----
-## Responsive Design Testing
-```javascript
-import { test, expect } from '@playwright/test';
-const devices = [
-  { name: 'iPhone 12', width: 390, height: 844 },
-  { name: 'iPad', width: 768, height: 1024 },
-  { name: 'Desktop', width: 1920, height: 1080 }
-];
-for (const device of devices) {
-  test(`layout on ${device.name}`, async ({ page }) => {
-    await page.setViewportSize({
-      width: device.width,
-      height: device.height
-    });
-    await page.goto('https://example.com');
-    const nav = await page.locator('nav');
-    if (device.width < 768) {
-      // Mobile: hamburger menu
-      expect(await nav.locator('.hamburger')).toBeVisible();
-    } else {
-      // Desktop: full menu
-      expect(await nav.locator('.menu-items')).toBeVisible();
-    }
-  });
-}
-```
 ---
 ## Cross-Browser with Playwright
@@ -202,8 +151,6 @@ const compatFleet = await FleetManager.coordinate({
 ## Remember
-**Test where users are, not where you develop.** Developers use latest Chrome on high-end machines. Users access from older browsers, low-end devices, and slow networks.
 **Cover 95%+ of your user base.** Use analytics to identify actual browser/device usage. Don't waste time on browsers nobody uses.
-**With Agents:** Agents orchestrate parallel cross-browser testing across cloud platforms, reducing 10 hours of manual testing to 15 minutes. `qe-visual-tester` catches visual inconsistencies across platforms automatically.
+**With Agents:** Agents orchestrate parallel cross-browser testing across cloud platforms. `qe-visual-tester` catches visual inconsistencies across platforms automatically.

package/.claude/skills/compliance-testing/SKILL.md CHANGED

@@ -229,3 +229,11 @@ const complianceFleet = await FleetManager.coordinate({
 **Audit trail everything.** Every access to sensitive data, every consent, every deletion must be logged with timestamps and user IDs.
 **With Agents:** Agents validate compliance requirements continuously, detect violations early, and generate audit-ready reports. Catch compliance issues in development, not in audits.
+## Gotchas
+- Agent checks GDPR consent flow but misses data retention — always verify deletion/anonymization actually works
+- Compliance reports with "100% compliant" are suspicious — no real system is fully compliant, verify each claim
+- Agent may test US regulations only — explicitly specify jurisdiction (EU, CA, etc.) for correct requirements
+- PII in test data is itself a compliance violation — never use production PII, use synthetic generators
+- Audit trail gaps are invisible until audit time — verify logging exists for EVERY data access, not just writes

package/.claude/skills/compliance-testing/config.json ADDED

@@ -0,0 +1,13 @@
+{
+  "$schema": "./config-schema.json",
+  "_description": "Compliance Testing configuration. Auto-created on first run. Edit to customize.",
+  "regulations": [],
+  "scope": null,
+  "options": {
+    "dataClassification": null,
+    "retentionPolicyDays": null,
+    "auditLogRequired": true,
+    "piiScanEnabled": true
+  },
+  "_setupPrompt": "If regulations is empty, ask: 'Which regulations apply to this project? (gdpr/ccpa/hipaa/soc2/pci-dss — comma-separated)'. If scope is null, ask: 'What is the compliance scope? (full-app/api-only/data-layer/specific-module)'."
+}

package/.claude/skills/consultancy-practices/SKILL.md CHANGED

@@ -40,28 +40,9 @@ When consulting on quality:
 ## Quick Reference Card
-### The Consulting Process
-| Phase | Duration | Goal | Deliverable |
-|-------|----------|------|-------------|
-| **Discovery** | Week 1-2 | Understand context | Interview notes, observations |
-| **Analysis** | Week 2-3 | Identify root causes | Impact/effort matrix |
-| **Recommendations** | Week 3-4 | Present findings | Report with roadmap |
-| **Implementation** | Month 2-6+ | Execute changes | Working system, trained team |
-| **Transition** | Final month | Ensure self-sufficiency | Handover docs |
-### Impact/Effort Matrix
-| Priority | What | Action |
-|----------|------|--------|
-| High Impact, Low Effort | Quick wins | Do first |
-| High Impact, High Effort | Major initiatives | Plan carefully |
-| Low Impact, Low Effort | Nice-to-haves | If time permits |
-| Low Impact, High Effort | Distractions | Skip |
 ---
-## Common Patterns
+## Common Patterns (What Clients Say vs. What They Mean)
 ### "We Need Test Automation"
@@ -110,18 +91,6 @@ When consulting on quality:
 ---
-## Anti-Patterns
-| Anti-Pattern | Problem | Better |
-|--------------|---------|--------|
-| **Cookie-Cutter** | Same solution everywhere | Context-specific recommendations |
-| **Tool Pusher** | Recommend expensive tools | Tools that solve actual problems |
-| **Process Nazi** | Impose rigid process | Lightweight, fits their culture |
-| **Permanent Fixture** | Never leave, create dependency | Work toward them not needing you |
-| **Blame Game** | Point fingers at people | Fix systems, not blame people |
----
 ## Difficult Situations
 **"We already tried that"**

package/.claude/skills/context-driven-testing/SKILL.md CHANGED

@@ -45,23 +45,6 @@ When making testing decisions or adapting approaches:
 - Adapting approach to specific constraints
 - Exploratory testing sessions
-### Seven Context-Driven Principles
-1. Value of any practice depends on its context
-2. Good practices in context, no universal best practices
-3. People working together are most important
-4. Projects unfold in unpredictable ways
-5. Product is a solution - if problem not solved, product fails
-6. Good testing is challenging intellectual work
-7. Judgment and skill determine right things at right times
-### Context Factors
-| Factor | Questions |
-|--------|-----------|
-| **Project** | Business goal? User needs? Failure impact? |
-| **Constraints** | Timeline? Budget? Team skills? Legacy? |
-| **Risk** | Safety-critical? Regulated? High volume? |
-| **Technical** | Stack quirks? Integrations? Observability? |
 ### RST Heuristics
 | Heuristic | Application |
 |-----------|-------------|
@@ -97,26 +80,6 @@ When making testing decisions or adapting approaches:
 ---
-## Investigation vs. Checking
-| Checking | Testing (Investigation) |
-|----------|------------------------|
-| Did API return 200? | Does API meet user needs? |
-| Does button work? | What happens under load? |
-| Match the spec? | Does it solve the problem? |
----
-## Red Flags: Not Context-Driven
-- Follow process "because that's how it's done"
-- Can't explain *why* you're doing something
-- Measure test cases executed, not problems found
-- Test plan could apply to any project
-- Stop thinking once you have a script
----
 ## Agent-Assisted Context-Driven Testing
 ```typescript
@@ -191,8 +154,4 @@ const contextFleet = await FleetManager.coordinate({
 ## Remember
-**Context drives decisions.** No universal best practices. Skilled testers make informed decisions based on specific goals, constraints, and risks.
-You're not a test script executor. You're a skilled investigator helping teams build better products.
 **With Agents:** Agents analyze context, adapt strategies, and learn what works in your situation. Use agents to scale context-driven thinking while maintaining human judgment for critical decisions.

package/.claude/skills/contract-testing/SKILL.md CHANGED

@@ -213,6 +213,10 @@ const contractFleet = await FleetManager.coordinate({
 ---
+## Agent CLI & Advanced Patterns
+For v3 agent-specific commands (`aqe contract ...`), GraphQL contracts, event contracts, and Pact Broker integration, see [references/agent-commands.md](references/agent-commands.md).
 ## Related Skills
 - [api-testing-patterns](../api-testing-patterns/) - API testing strategies
 - [shift-left-testing](../shift-left-testing/) - Early contract validation
@@ -225,3 +229,11 @@ const contractFleet = await FleetManager.coordinate({
 **Consumers own the contract.** They define what they need; providers must fulfill it. Breaking changes require major version bumps and coordination. CI/CD blocks deploys that break contracts. Use Pact for consumer-driven, OpenAPI for API-first.
 **With Agents:** Agents validate contracts, detect breaking changes with semver recommendations, and generate migration guides. Use agents to maintain contract compliance at scale.
+## Gotchas
+- Pact broker URL must be configured before running — agent will generate tests that silently skip verification without it
+- Consumer tests pass locally but fail in CI when provider states aren't set up — always verify both sides
+- Adding a required field to a response is a BREAKING change even though provider tests pass — consumer didn't expect it
+- Agent may generate contracts from API docs instead of actual consumer usage — contracts must reflect real consumer needs
+- GraphQL contract testing requires schema stitching awareness — fragments may reference types from other services

package/.claude/skills/contract-testing/config.json ADDED

@@ -0,0 +1,13 @@
+{
+  "$schema": "./config-schema.json",
+  "_description": "Contract Testing configuration. Auto-created on first run. Edit to customize.",
+  "broker_url": null,
+  "consumer_name": null,
+  "provider_name": null,
+  "options": {
+    "publishVerificationResults": true,
+    "enablePending": true,
+    "includeWipPactsSince": null
+  },
+  "_setupPrompt": "If broker_url is null, ask: 'What is your Pact Broker URL? (or \"local\" for file-based)'. If consumer_name is null, ask: 'What is the consumer service name?'. If provider_name is null, ask: 'What is the provider service name?'"
+}