npm - agentic-qe - Versions diffs - 2.0.0 → 2.1.1 - Mend

agentic-qe 2.0.0 → 2.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (144) hide show

package/.claude/agents/qx-partner.md +17 -4
package/.claude/skills/accessibility-testing/SKILL.md +144 -692
package/.claude/skills/agentic-quality-engineering/SKILL.md +176 -529
package/.claude/skills/api-testing-patterns/SKILL.md +180 -560
package/.claude/skills/brutal-honesty-review/SKILL.md +113 -603
package/.claude/skills/bug-reporting-excellence/SKILL.md +116 -517
package/.claude/skills/chaos-engineering-resilience/SKILL.md +127 -72
package/.claude/skills/cicd-pipeline-qe-orchestrator/SKILL.md +209 -404
package/.claude/skills/code-review-quality/SKILL.md +158 -608
package/.claude/skills/compatibility-testing/SKILL.md +148 -38
package/.claude/skills/compliance-testing/SKILL.md +132 -63
package/.claude/skills/consultancy-practices/SKILL.md +114 -446
package/.claude/skills/context-driven-testing/SKILL.md +117 -381
package/.claude/skills/contract-testing/SKILL.md +176 -141
package/.claude/skills/database-testing/SKILL.md +137 -130
package/.claude/skills/exploratory-testing-advanced/SKILL.md +160 -629
package/.claude/skills/holistic-testing-pact/SKILL.md +140 -188
package/.claude/skills/localization-testing/SKILL.md +145 -33
package/.claude/skills/mobile-testing/SKILL.md +132 -448
package/.claude/skills/mutation-testing/SKILL.md +147 -41
package/.claude/skills/performance-testing/SKILL.md +200 -546
package/.claude/skills/quality-metrics/SKILL.md +164 -519
package/.claude/skills/refactoring-patterns/SKILL.md +132 -699
package/.claude/skills/regression-testing/SKILL.md +120 -926
package/.claude/skills/risk-based-testing/SKILL.md +157 -660
package/.claude/skills/security-testing/SKILL.md +199 -538
package/.claude/skills/sherlock-review/SKILL.md +163 -699
package/.claude/skills/shift-left-testing/SKILL.md +161 -465
package/.claude/skills/shift-right-testing/SKILL.md +161 -519
package/.claude/skills/six-thinking-hats/SKILL.md +175 -1110
package/.claude/skills/skills-manifest.json +71 -20
package/.claude/skills/tdd-london-chicago/SKILL.md +131 -448
package/.claude/skills/technical-writing/SKILL.md +103 -154
package/.claude/skills/test-automation-strategy/SKILL.md +166 -772
package/.claude/skills/test-data-management/SKILL.md +126 -910
package/.claude/skills/test-design-techniques/SKILL.md +179 -89
package/.claude/skills/test-environment-management/SKILL.md +136 -91
package/.claude/skills/test-reporting-analytics/SKILL.md +169 -92
package/.claude/skills/testability-scoring/SKILL.md +172 -538
package/.claude/skills/testability-scoring/scripts/generate-html-report.js +0 -0
package/.claude/skills/visual-testing-advanced/SKILL.md +155 -78
package/.claude/skills/xp-practices/SKILL.md +151 -587
package/CHANGELOG.md +86 -0
package/README.md +23 -16
package/dist/agents/QXPartnerAgent.d.ts +47 -1
package/dist/agents/QXPartnerAgent.d.ts.map +1 -1
package/dist/agents/QXPartnerAgent.js +2086 -125
package/dist/agents/QXPartnerAgent.js.map +1 -1
package/dist/agents/lifecycle/AgentLifecycleManager.d.ts.map +1 -1
package/dist/agents/lifecycle/AgentLifecycleManager.js +34 -31
package/dist/agents/lifecycle/AgentLifecycleManager.js.map +1 -1
package/dist/cli/commands/init-claude-md-template.d.ts.map +1 -1
package/dist/cli/commands/init-claude-md-template.js +14 -0
package/dist/cli/commands/init-claude-md-template.js.map +1 -1
package/dist/core/SwarmCoordinator.d.ts +180 -0
package/dist/core/SwarmCoordinator.d.ts.map +1 -0
package/dist/core/SwarmCoordinator.js +473 -0
package/dist/core/SwarmCoordinator.js.map +1 -0
package/dist/core/memory/ReflexionMemoryAdapter.d.ts +109 -0
package/dist/core/memory/ReflexionMemoryAdapter.d.ts.map +1 -0
package/dist/core/memory/ReflexionMemoryAdapter.js +306 -0
package/dist/core/memory/ReflexionMemoryAdapter.js.map +1 -0
package/dist/core/memory/RuVectorPatternStore.d.ts +28 -0
package/dist/core/memory/RuVectorPatternStore.d.ts.map +1 -1
package/dist/core/memory/RuVectorPatternStore.js +70 -0
package/dist/core/memory/RuVectorPatternStore.js.map +1 -1
package/dist/core/memory/SparseVectorSearch.d.ts +55 -0
package/dist/core/memory/SparseVectorSearch.d.ts.map +1 -0
package/dist/core/memory/SparseVectorSearch.js +130 -0
package/dist/core/memory/SparseVectorSearch.js.map +1 -0
package/dist/core/memory/TieredCompression.d.ts +81 -0
package/dist/core/memory/TieredCompression.d.ts.map +1 -0
package/dist/core/memory/TieredCompression.js +270 -0
package/dist/core/memory/TieredCompression.js.map +1 -0
package/dist/core/memory/index.d.ts +6 -0
package/dist/core/memory/index.d.ts.map +1 -1
package/dist/core/memory/index.js +29 -1
package/dist/core/memory/index.js.map +1 -1
package/dist/core/metrics/MetricsAggregator.d.ts +228 -0
package/dist/core/metrics/MetricsAggregator.d.ts.map +1 -0
package/dist/core/metrics/MetricsAggregator.js +482 -0
package/dist/core/metrics/MetricsAggregator.js.map +1 -0
package/dist/core/metrics/index.d.ts +5 -0
package/dist/core/metrics/index.d.ts.map +1 -0
package/dist/core/metrics/index.js +11 -0
package/dist/core/metrics/index.js.map +1 -0
package/dist/core/optimization/SwarmOptimizer.d.ts +5 -0
package/dist/core/optimization/SwarmOptimizer.d.ts.map +1 -1
package/dist/core/optimization/SwarmOptimizer.js +17 -0
package/dist/core/optimization/SwarmOptimizer.js.map +1 -1
package/dist/core/orchestration/AdaptiveScheduler.d.ts +190 -0
package/dist/core/orchestration/AdaptiveScheduler.d.ts.map +1 -0
package/dist/core/orchestration/AdaptiveScheduler.js +460 -0
package/dist/core/orchestration/AdaptiveScheduler.js.map +1 -0
package/dist/core/orchestration/WorkflowOrchestrator.d.ts +13 -0
package/dist/core/orchestration/WorkflowOrchestrator.d.ts.map +1 -1
package/dist/core/orchestration/WorkflowOrchestrator.js +32 -0
package/dist/core/orchestration/WorkflowOrchestrator.js.map +1 -1
package/dist/core/recovery/CircuitBreaker.d.ts +176 -0
package/dist/core/recovery/CircuitBreaker.d.ts.map +1 -0
package/dist/core/recovery/CircuitBreaker.js +382 -0
package/dist/core/recovery/CircuitBreaker.js.map +1 -0
package/dist/core/recovery/RecoveryOrchestrator.d.ts +186 -0
package/dist/core/recovery/RecoveryOrchestrator.d.ts.map +1 -0
package/dist/core/recovery/RecoveryOrchestrator.js +476 -0
package/dist/core/recovery/RecoveryOrchestrator.js.map +1 -0
package/dist/core/recovery/RetryStrategy.d.ts +127 -0
package/dist/core/recovery/RetryStrategy.d.ts.map +1 -0
package/dist/core/recovery/RetryStrategy.js +314 -0
package/dist/core/recovery/RetryStrategy.js.map +1 -0
package/dist/core/recovery/index.d.ts +8 -0
package/dist/core/recovery/index.d.ts.map +1 -0
package/dist/core/recovery/index.js +27 -0
package/dist/core/recovery/index.js.map +1 -0
package/dist/core/skills/DependencyResolver.d.ts +99 -0
package/dist/core/skills/DependencyResolver.d.ts.map +1 -0
package/dist/core/skills/DependencyResolver.js +260 -0
package/dist/core/skills/DependencyResolver.js.map +1 -0
package/dist/core/skills/ManifestGenerator.d.ts +114 -0
package/dist/core/skills/ManifestGenerator.d.ts.map +1 -0
package/dist/core/skills/ManifestGenerator.js +449 -0
package/dist/core/skills/ManifestGenerator.js.map +1 -0
package/dist/core/skills/index.d.ts +9 -0
package/dist/core/skills/index.d.ts.map +1 -0
package/dist/core/skills/index.js +24 -0
package/dist/core/skills/index.js.map +1 -0
package/dist/mcp/handlers/chaos/chaos-inject-failure.d.ts +5 -0
package/dist/mcp/handlers/chaos/chaos-inject-failure.d.ts.map +1 -1
package/dist/mcp/handlers/chaos/chaos-inject-failure.js +36 -2
package/dist/mcp/handlers/chaos/chaos-inject-failure.js.map +1 -1
package/dist/mcp/handlers/chaos/chaos-inject-latency.d.ts +5 -0
package/dist/mcp/handlers/chaos/chaos-inject-latency.d.ts.map +1 -1
package/dist/mcp/handlers/chaos/chaos-inject-latency.js +36 -2
package/dist/mcp/handlers/chaos/chaos-inject-latency.js.map +1 -1
package/dist/mcp/server.d.ts +9 -9
package/dist/mcp/server.d.ts.map +1 -1
package/dist/mcp/server.js +1 -2
package/dist/mcp/server.js.map +1 -1
package/dist/types/qx.d.ts +113 -7
package/dist/types/qx.d.ts.map +1 -1
package/dist/types/qx.js.map +1 -1
package/dist/visualization/api/RestEndpoints.js +1 -1
package/dist/visualization/api/RestEndpoints.js.map +1 -1
package/package.json +15 -54

package/.claude/skills/brutal-honesty-review/SKILL.md CHANGED Viewed

@@ -1,153 +1,72 @@
 ---
-name: "Brutal Honesty Review"
+name: brutal-honesty-review
 description: "Unvarnished technical criticism combining Linus Torvalds' precision, Gordon Ramsay's standards, and James Bach's BS-detection. Use when code/tests need harsh reality checks, certification schemes smell fishy, or technical decisions lack rigor. No sugar-coating, just surgical truth about what's broken and why."
+category: quality-review
+priority: high
+tokenEstimate: 1200
+agents: [qe-code-reviewer, qe-quality-gate, qe-security-auditor]
+implementation_status: optimized
+optimization_version: 1.0
+last_optimized: 2025-12-03
+dependencies: []
+quick_reference_card: true
+tags: [code-review, honesty, critical-thinking, technical-criticism, quality]
 ---
 # Brutal Honesty Review
-## What This Skill Does
+<default_to_action>
+When brutal honesty is needed:
+1. CHOOSE MODE: Linus (technical), Ramsay (standards), Bach (BS detection)
+2. VERIFY CONTEXT: Senior engineer? Repeated mistake? Critical bug? Explicit request?
+3. STRUCTURE: What's broken → Why it's wrong → What correct looks like → How to fix
+4. ATTACK THE WORK, not the worker
+5. ALWAYS provide actionable path forward
-Delivers brutally honest technical criticism across three modes, each calibrated to eliminate different types of mediocrity:
+**Quick Mode Selection:**
+- **Linus**: Code is technically wrong, inefficient, misunderstands fundamentals
+- **Ramsay**: Quality is subpar compared to clear excellence model
+- **Bach**: Certifications, best practices, or vendor hype need reality check
-1. **Linus Mode** - Surgical technical precision on code/architecture
-2. **Ramsay Mode** - Standards-driven quality assessment
-3. **Bach Mode** - BS detection in testing practices/certifications
+**Calibration:**
+- Level 1 (Direct): "This approach is fundamentally flawed because..."
+- Level 2 (Harsh): "We've discussed this three times. Why is it back?"
+- Level 3 (Brutal): "This is negligent. You're exposing user data because..."
-Unlike diplomatic reviews, this skill **dissects why something is wrong, explains the correct approach, and has zero patience for repeated mistakes or sloppy thinking**.
+**DO NOT USE FOR:** Junior devs' first PRs, demoralized teams, public forums, low psychological safety
+</default_to_action>
----
-## Prerequisites
-- **Thick skin** - This will hurt
-- **Willingness to learn** - Criticism is actionable, not performative
-- **Context awareness** - Know when harsh honesty helps vs. harms team morale
----
-## When to Use This Skill
-### ✅ APPROPRIATE CONTEXTS:
-- Senior engineers who want unfiltered technical review
-- Teaching moments for patterns that keep recurring
-- Evaluating vendor claims, certification schemes, or industry hype
-- Code that's been "good enough" for too long
-- Teams explicitly asking for no-BS feedback
-- Security vulnerabilities or critical bugs
-- Technical debt requiring executive attention
-### ❌ INAPPROPRIATE CONTEXTS:
-- Junior developers' first contributions
-- Already-demoralized teams
-- Public forums (brutal honesty ≠ public humiliation)
-- When psychological safety is low
-- Performance reviews (unless specifically requested)
----
-## Quick Start
-### Mode 1: Linus (Technical Precision)
-**When**: Code is technically wrong, inefficient, or demonstrates misunderstanding
+## Quick Reference Card
-```bash
-# Usage
-Review this pull request with Linus-level precision:
-[paste code/PR link]
+### When to Use
+| Context | Appropriate? | Why |
+|---------|-------------|-----|
+| Senior engineer code review | ✅ Yes | Can handle directness, respects precision |
+| Repeated architectural mistakes | ✅ Yes | Gentle approaches failed |
+| Security vulnerabilities | ✅ Yes | Stakes too high for sugar-coating |
+| Evaluating vendor claims | ✅ Yes | BS detection prevents expensive mistakes |
+| Junior dev's first PR | ❌ No | Use constructive mentoring |
+| Demoralized team | ❌ No | Will break, not motivate |
+| Public forum | ❌ No | Public humiliation destroys trust |
-# Output style
-"This is completely broken. You're holding the lock for the entire
-I/O operation, which means every thread will serialize on this mutex.
-Did you even test this under load? The correct approach is..."
-```
+### Three Modes
-**Characteristics**:
-- ❌ Eliminates ambiguity about technical standards
-- ✅ Explains WHY it's wrong, not just THAT it's wrong
-- ⚡ Zero tolerance for repeated architectural mistakes
-- 🎯 Focuses on correctness, performance, maintainability
+| Mode | When | Example Output |
+|------|------|----------------|
+| **Linus** | Code technically wrong | "You're holding the lock for the entire I/O. Did you test under load?" |
+| **Ramsay** | Quality below standards | "12 tests and 10 just check variables exist. Where's the business logic?" |
+| **Bach** | BS detection needed | "This cert tests memorization, not bug-finding. Who actually benefits?" |
 ---
-### Mode 2: Ramsay (Standards-Driven)
-**When**: Quality is subpar compared to clear excellence model
-```bash
-# Usage
-Assess this test suite against production standards:
-[paste test code]
-# Output style
-"Look at this! You've got 12 tests and 10 of them are just checking
-if variables exist. Where's the business logic coverage? Where's the
-edge cases? This is RAW. You wouldn't serve this in production, so
-why are you trying to merge it?"
-```
-**Characteristics**:
-- 🔥 Compares reality against clear mental model of excellence
-- 📊 Uses concrete metrics (coverage, complexity, duplication)
-- 🎓 Teaching through high standards, not just criticism
-- 💎 Doesn't just tear down - shows what good looks like
----
-### Mode 3: Bach (BS Detection)
-**When**: Certifications, best practices, or vendor hype need reality check
-```bash
-# Usage
-Evaluate this testing certification/practice/tool claim:
-[paste claim or approach]
-# Output style
-"This certification teaches you to follow scripts, not to think.
-Real testing requires context-driven decisions, not checkbox compliance.
-Does this cert help testers find bugs faster? No. Does it help them
-advocate for quality? No. It helps the certification body make money."
-```
-**Characteristics**:
-- 🚨 Calls out cargo cult practices
-- 🔍 Questions: "Does this actually help?"
-- 📉 Exposes when tools/processes exist for vendor profit, not user benefit
-- 🧠 Promotes critical thinking over certification theater
----
-## Step-by-Step Guide
-### Step 1: Choose Your Mode
-```markdown
-**Linus Mode**: Code/architecture review requiring technical precision
-**Ramsay Mode**: Quality assessment against known standards
-**Bach Mode**: Evaluating practices, certifications, industry claims
-```
-### Step 2: Establish Context
-```markdown
-Before delivering brutal honesty, verify:
-1. **Audience maturity** - Can they handle direct criticism?
-2. **Relationship capital** - Have you earned the right to be harsh?
-3. **Actionability** - Can the recipient actually fix this?
-4. **Intent** - Is this helping them improve or just venting?
-```
-### Step 3: Deliver Structured Criticism
-Each mode follows this template:
+## The Criticism Structure
 ```markdown
 ## What's Broken
-[Surgical description of the problem]
+[Surgical description - specific, technical]
 ## Why It's Wrong
-[Technical/logical explanation, not opinion]
+[Technical explanation, not opinion]
 ## What Correct Looks Like
 [Clear model of excellence]
@@ -159,47 +78,15 @@ Each mode follows this template:
 [Impact if not fixed]
 ```
-### Step 4: Calibrate Harshness
-```markdown
-**Level 1 - Direct** (for experienced engineers):
-"This approach is fundamentally flawed because..."
-**Level 2 - Harsh** (for repeated mistakes):
-"We've discussed this pattern three times. Why is it back?"
-**Level 3 - Brutal** (for critical issues or willful ignorance):
-"This is negligent. You're exposing user data because..."
-```
 ---
-## Mode Details
+## Mode Examples
 ### Linus Mode: Technical Precision
-#### Code Review Pattern
-```markdown
-### 1. Identify Fundamental Flaw
-"You're doing [X], which demonstrates misunderstanding of [concept]."
-### 2. Explain Why It's Wrong
-"This breaks when [scenario] because [technical reason]."
-### 3. Show Correct Approach
-"The correct pattern is [Y] because [reasoning]."
-### 4. Demand Better
-"This should never have passed local testing. Did you run it?"
-```
-#### Example: Concurrency Bug
 ```markdown
 **Problem**: Holding database connection during HTTP call
-**Linus Analysis**:
 "This is completely broken. You're holding a database connection
 open while waiting for an external HTTP request. Under load, you'll
 exhaust the connection pool in seconds.
@@ -215,511 +102,134 @@ The correct approach is:
 This is Connection Management 101. Why wasn't this caught in review?"
 ```
-#### Example: Premature Optimization
-```markdown
-**Problem**: Complex caching for operation that runs once per day
-**Linus Analysis**:
-"You've added 200 lines of caching logic with Redis, LRU eviction,
-and TTL management for a report that generates once daily.
-This is premature optimization. The cure is worse than the disease.
-Measure first. Your 'optimization' added:
-- 3 new failure modes (Redis down, cache corruption, TTL bugs)
-- 10x complexity
-- Zero measurable benefit
-Remove it. If profiling later shows this endpoint is slow, then optimize."
-```
----
 ### Ramsay Mode: Standards-Driven Quality
-#### Test Quality Assessment
-```markdown
-### 1. Compare to Excellence Model
-"Good tests should [criteria]. This test suite [fails at criteria]."
-### 2. Use Concrete Metrics
-"You have 12% branch coverage. Production-ready is 80%+."
-### 3. Show Gap
-"Look at this edge case. It's obvious. Why isn't it tested?"
-### 4. Demand Excellence
-"You know what good looks like. Why didn't you deliver it?"
-```
-#### Example: Weak Test Suite
 ```markdown
 **Problem**: Tests only verify happy path
-**Ramsay Analysis**:
-"Look at this test suite. You've got 15 tests, and 14 of them are
-just happy path scenarios. Where's the validation testing? Where are
-the edge cases? Where's the failure mode testing?
+"Look at this test suite. 15 tests, 14 happy path scenarios.
+Where's the validation testing? Edge cases? Failure modes?
-This is RAW. You're testing if the code runs, not if it's correct.
+This is RAW. You're testing if code runs, not if it's correct.
-A production-ready test suite covers:
+Production-ready covers:
 ✓ Happy path (you have this)
 ✗ Validation failures (missing)
 ✗ Boundary conditions (missing)
 ✗ Error handling (missing)
 ✗ Concurrent access (missing)
-✗ Resource exhaustion (missing)
-You wouldn't ship code with 12% coverage. Don't merge tests with
-12% scenario coverage."
-```
-#### Example: Flaky Tests
-```markdown
-**Problem**: Test suite has intermittent failures
-**Ramsay Analysis**:
-"These tests are FLAKY. Every third run fails because you're using
-setTimeout() and hoping things complete in time. That's not testing,
-that's gambling.
-Flaky tests train developers to ignore failures. That's worse than
-no tests.
-Fix this NOW:
-1. Remove all setTimeout() - use proper async/await
-2. Mock external dependencies - don't test network reliability
-3. Make tests deterministic - same input = same output
-4. If it can't be made stable, DELETE IT
-A flaky test is a broken test. Don't merge broken code."
+You wouldn't ship code with 12% coverage. Don't merge tests
+with 12% scenario coverage."
 ```
----
 ### Bach Mode: BS Detection
-#### Certification Evaluation
 ```markdown
-### 1. Question Core Value
-"Does this [practice/cert/tool] help testers do better work?"
+**Problem**: ISTQB certification required for QE roles
-### 2. Identify Real Beneficiary
-"Who profits from this? Vendor, consultant, or actual testers?"
-### 3. Expose Cargo Cult Thinking
-"This promotes [checkbox compliance] over [critical thinking]."
-### 4. Advocate for Context-Driven Approach
-"Real testing requires [context-specific decisions], not [universal rules]."
-```
-#### Example: Test Automation Tool Hype
-```markdown
-**Problem**: Vendor claims "AI-powered testing eliminates manual testing"
-**Bach Analysis**:
-"This is vendor hype, not reality. They're selling you magic beans.
-'AI-powered' means what exactly? Machine learning model that auto-generates
-tests? Fine. But:
-1. **Who defines test oracles?** (AI can't know business rules)
-2. **Who handles edge cases?** (AI trained on common paths)
-3. **Who questions requirements?** (AI doesn't understand context)
-4. **Who investigates failures?** (AI can't do root cause analysis)
-This tool might auto-generate SOME tests (usually shallow). It won't
-replace thinking testers.
-The real question: Does this tool help YOUR testers on YOUR product
-in YOUR context? If you can't answer that specifically, you're buying
-hype, not tools."
-```
-#### Example: Certification Theater
-```markdown
-**Problem**: Company requires ISTQB certification for QE roles
-**Bach Analysis**:
 "ISTQB tests if you memorized terminology, not if you can test software.
 Real testing skills:
 - Finding bugs others miss
-- Designing effective test strategies for context
+- Designing effective strategies for context
 - Communicating risk to stakeholders
-- Questioning requirements and assumptions
-- Advocating for quality
 ISTQB tests:
-- Definitions of 'alpha testing' vs 'beta testing'
-- Names of test design techniques you'll never use
-- V-model vs Agile terminology
-- Checkbox thinking
-If ISTQB helped testers get better, companies with ISTQB-certified
-teams would ship higher quality. They don't.
+- Definitions of 'alpha' vs 'beta' testing
+- Names of techniques you'll never use
+- V-model terminology
-Want better testers? Hire curious people, give them context, let them
-explore, teach them technical skills. Certification is optional.
-Thinking is mandatory."
+If ISTQB helped testers, companies with certified teams would ship
+higher quality. They don't."
 ```
 ---
 ## Assessment Rubrics
-### Code Quality Rubric (Linus Mode)
+### Code Quality (Linus Mode)
-```markdown
 | Criteria | Failing | Passing | Excellent |
 |----------|---------|---------|-----------|
-| **Correctness** | Wrong algorithm/logic | Works in tested cases | Proven correct across edge cases |
-| **Performance** | Naive O(n²) where O(n) exists | Acceptable complexity | Optimal algorithm + profiled |
-| **Error Handling** | Crashes on invalid input | Returns error codes | Graceful degradation + logging |
-| **Concurrency** | Race conditions present | Thread-safe with locks | Lock-free or proven safe |
-| **Testability** | Impossible to unit test | Can be tested with mocks | Self-testing design |
-| **Maintainability** | "Clever" code | Clear intent | Self-documenting + simple |
-**Passing Threshold**: Minimum "Passing" on all criteria
-**Ship-Ready**: Minimum "Excellent" on Correctness, Performance, Error Handling
-```
+| Correctness | Wrong algorithm | Works in tested cases | Proven across edge cases |
+| Performance | Naive O(n²) | Acceptable complexity | Optimal + profiled |
+| Error Handling | Crashes on invalid | Returns error codes | Graceful degradation |
+| Testability | Impossible to test | Can mock | Self-testing design |
-### Test Quality Rubric (Ramsay Mode)
+### Test Quality (Ramsay Mode)
-```markdown
 | Criteria | Raw | Acceptable | Michelin Star |
 |----------|-----|------------|---------------|
-| **Coverage** | <50% branch | 80%+ branch | 95%+ branch + mutation tested |
-| **Edge Cases** | Only happy path | Common failures | Boundary analysis complete |
-| **Clarity** | What is this testing? | Clear test names | Self-documenting test pyramid |
-| **Speed** | Minutes to run | <10s for unit tests | <1s, parallelized |
-| **Stability** | Flaky (>1% failure) | Stable but slow | Deterministic + fast |
-| **Isolation** | Tests depend on each other | Independent tests | Pure functions, no shared state |
-**Merge Threshold**: Minimum "Acceptable" on all criteria
-**Production-Ready**: Minimum "Michelin Star" on Coverage, Stability, Isolation
-```
+| Coverage | <50% branch | 80%+ branch | 95%+ mutation tested |
+| Edge Cases | Only happy path | Common failures | Boundary analysis complete |
+| Stability | Flaky (>1% failure) | Stable but slow | Deterministic + fast |
-### BS Detection Rubric (Bach Mode)
+### BS Detection (Bach Mode)
-```markdown
 | Red Flag | Evidence | Impact |
 |----------|----------|--------|
-| **Cargo Cult Practice** | "Best practice" with no context | Wasted effort, false confidence |
-| **Certification Theater** | Required cert unrelated to skills | Filters out critical thinkers |
-| **Vendor Lock-In** | Tool solves problem it created | Expensive dependency |
-| **False Automation** | "AI" still needs human verification | Automation debt |
-| **Checkbox Quality** | Compliance without outcomes | Audit passes, customers suffer |
-| **Hype Cycle** | Promises 10x improvement | Budget waste, disillusionment |
-**Green Flag Test**: "Does this help testers/developers do better work in THIS context?"
-```
+| Cargo Cult Practice | "Best practice" with no context | Wasted effort |
+| Certification Theater | Required cert unrelated to skills | Filters out thinkers |
+| Vendor Lock-In | Tool solves problem it created | Expensive dependency |
 ---
-## Calibration Guide
-### When Brutal Honesty Works
-```markdown
-✅ **Senior engineer with ego but skills**
-   → They can handle directness and will respect precision
-✅ **Repeated architectural mistakes**
-   → Gentle approaches failed; escalation needed
-✅ **Critical bug in production code**
-   → Stakes are high; no time for sugar-coating
-✅ **Evaluating vendor claims before purchase**
-   → BS detection prevents expensive mistakes
-✅ **Team explicitly requests no-BS feedback**
-   → They've given permission for harshness
-```
-### When to Dial It Back
-```markdown
-❌ **Junior developer's first PR**
-   → Use constructive mentoring instead
-❌ **Team is already demoralized**
-   → Harsh criticism will break, not motivate
+## Agent Integration
-❌ **Public forum or team meeting**
-   → Public humiliation destroys trust
+```typescript
+// Brutal honesty code review
+await Task("Code Review", {
+  code: pullRequestDiff,
+  mode: 'linus',  // or 'ramsay', 'bach'
+  calibration: 'direct',  // or 'harsh', 'brutal'
+  requireActionable: true
+}, "qe-code-reviewer");
-❌ **Unclear if recipient can fix it**
-   → Frustration without actionability is cruel
-❌ **Personal attack vs. technical criticism**
-   → Never: "You're stupid"
-   → Always: "This approach is flawed because..."
+// BS detection for vendor claims
+await Task("Vendor Evaluation", {
+  claims: vendorMarketingClaims,
+  mode: 'bach',
+  requireEvidence: true
+}, "qe-quality-gate");
 ```
 ---
-## Examples from History
-### Linus Torvalds: Technical Precision
-> **Original Email** (kernel mailing list):
-> "Christ, people. Learn to use git rebase. This merge mess is unreadable.
-> I'm not pulling this garbage until you clean up the history. And don't
-> give me that 'git is hard' excuse - it's your job to know your tools."
-**Why It Worked**:
-- ✅ Clear technical standard (clean git history)
-- ✅ Actionable fix (use rebase)
-- ✅ Audience was experienced kernel developers
-- ✅ Pattern had been explained before
-**When It Backfired**:
-- ❌ Created hostile environment for newcomers
-- ❌ Scared away potential contributors
-- ❌ Linus later acknowledged cost to community
-**Lesson**: Technical precision without empathy scales poorly.
----
-### Gordon Ramsay: Standards-Driven Excellence
+## Agent Coordination Hints
-> **Kitchen Nightmares**:
-> "You've served me frozen ravioli from a bag and tried to pass it off as
-> fresh pasta. Do you think I'm an idiot? Your customers aren't idiots either.
-> You know what fresh pasta tastes like - why are you serving this?"
-**Why It Worked**:
-- ✅ Clear standard (fresh pasta vs. frozen)
-- ✅ Owner had expertise (was trained chef)
-- ✅ Impact was clear (losing customers)
-- ✅ Ramsay showed what excellence looked like (cooked fresh pasta)
-**Structure**:
-1. Identify gap between current and excellent
-2. Question why gap exists (laziness, cost-cutting, ignorance)
-3. Demonstrate excellence
-4. Demand recipient meet the standard they already know
----
-### James Bach: BS Detection in Testing
-> **Blog Post on Test Automation**:
-> "When a vendor tells you their tool 'automates testing,' ask them to define
-> 'testing.' They usually mean 'running checks' - verifying known conditions.
-> Actual testing requires thinking, questioning, exploring. That can't be
-> automated. What they're selling is useful, but it's not testing. Don't
-> let marketing confuse you."
-**Why It Works**:
-- ✅ Clarifies terminology confusion
-- ✅ Exposes economic incentives (vendor profit)
-- ✅ Empowers testers to think critically
-- ✅ Doesn't attack tool, attacks misleading claims
-**Structure**:
-1. Identify the BS claim
-2. Explain why it's misleading
-3. Clarify what's actually true
-4. Advocate for context-driven thinking
----
-## Advanced Patterns
-### Pattern 1: The Technical Breakdown
-**When**: Code demonstrates fundamental misunderstanding
-```markdown
-**Step 1**: Identify the core misunderstanding
-"You're treating this like a single-threaded problem, but it's not."
-**Step 2**: Explain the fundamental concept
-"In concurrent systems, shared mutable state requires synchronization."
-**Step 3**: Show where it breaks
-"When thread A reads X=5, thread B might write X=10 before A completes."
-**Step 4**: Demand better
-"This is concurrency 101. Why wasn't this caught in review?"
+### Memory Namespace
 ```
-### Pattern 2: The Standards Gap
-**When**: Quality is measurably below known standards
-```markdown
-**Step 1**: Establish the standard
-"Production-ready code has 80%+ branch coverage."
-**Step 2**: Measure the gap
-"This has 35% coverage."
-**Step 3**: Show what's missing
-"You're not testing error paths, edge cases, or validation."
-**Step 4**: Demand excellence
-"You know what good looks like. Deliver it."
+aqe/brutal-honesty/
+├── code-reviews/*     - Technical review findings
+├── bs-detection/*     - Vendor/cert evaluations
+└── calibration/*      - Context-appropriate levels
 ```
-### Pattern 3: The BS Detector
-**When**: Claims don't match reality
-```markdown
-**Step 1**: State the claim
-"This certification proves testing competency."
-**Step 2**: Question it
-"Does it? What does it actually test?"
-**Step 3**: Expose the gap
-"It tests memorization, not bug-finding ability."
-**Step 4**: Advocate for reality
-"Want better testers? Measure outcomes, not credentials."
-```
----
-## Troubleshooting
-### Issue: Feedback Feels Personal
-**Symptoms**: Recipient becomes defensive or emotional
-**Cause**: Criticism targeted person instead of work
-**Solution**:
-```markdown
-❌ "You're not thinking about edge cases"
-✅ "This code doesn't handle edge cases because..."
-❌ "You always write flaky tests"
-✅ "These tests are flaky because they depend on timing"
-**Key**: Attack the work, not the worker.
-```
-### Issue: Feedback Isn't Actionable
-**Symptoms**: "This sucks" without explanation
-**Cause**: Missing the "why" and "how to fix"
-**Solution**:
-```markdown
-❌ "This code is terrible"
-✅ "This code is inefficient because [reason]. Fix by [approach]."
-**Structure**:
-1. What's wrong (specific)
-2. Why it's wrong (technical reason)
-3. What correct looks like (model)
-4. How to fix it (actionable)
-```
-### Issue: Calibration is Wrong
-**Symptoms**: Harsh feedback in wrong context (junior dev, demoralized team)
-**Cause**: Forgot to check audience/context
-**Solution**:
-```markdown
-**Before brutal honesty, verify:**
-1. Recipient has skills to fix it
-2. Relationship capital exists
-3. Context allows for directness
-4. Psychological safety is high
-**If any are false, dial back to constructive.**
+### Fleet Coordination
+```typescript
+const reviewFleet = await FleetManager.coordinate({
+  strategy: 'brutal-review',
+  agents: [
+    'qe-code-reviewer',    // Technical precision
+    'qe-security-auditor', // Security brutality
+    'qe-quality-gate'      // Standards enforcement
+  ],
+  topology: 'parallel'
+});
 ```
 ---
 ## Related Skills
-- **[Code Review Quality](../code-review-quality/)** - Diplomatic version of code review
-- **[Context-Driven Testing](../context-driven-testing/)** - Foundation for Bach-mode BS detection
-- **[TDD Red-Green-Refactor](../tdd-london-chicago/)** - Systematic quality approach
-- **[Exploratory Testing](../exploratory-testing-advanced/)** - Critical thinking in testing
+- [code-review-quality](../code-review-quality/) - Diplomatic version
+- [context-driven-testing](../context-driven-testing/) - Foundation for Bach mode
+- [sherlock-review](../sherlock-review/) - Evidence-based investigation
 ---
-## Philosophy
-### Why Brutal Honesty Has a Place
-**1. Eliminates Ambiguity**
-- Diplomatic: "Maybe consider using a different approach?"
-- Brutal: "This approach is wrong because [reason]. Use [correct approach]."
-- **Result**: No confusion about expectations.
-**2. Scales Technical Standards**
-- Gentle mentoring works 1:1, doesn't scale
-- Brutal public technical breakdown teaches entire team
-- **Trade-off**: Works only with psychologically safe, mature teams.
-**3. Cuts Through BS**
-- Certifications, vendor hype, cargo cult practices thrive on politeness
-- Brutal honesty exposes when emperor has no clothes
-- **Result**: Resources spent on what actually helps.
-### The Costs
-**1. Relationship Damage**
-- Harsh criticism without trust destroys collaboration
-- Public brutality creates hostile environment
-- **Mitigation**: Earn relationship capital first.
-**2. Chills Participation**
-- Fear of harsh feedback stops people from contributing
-- Newcomers avoid communities known for brutal feedback
-- **Mitigation**: Reserve brutality for experienced engineers and repeated mistakes.
+## Remember
-**3. Burnout**
-- Constant harsh criticism is exhausting
-- Both giver and receiver pay psychological cost
-- **Mitigation**: Use sparingly, only when necessary.
----
-## The Brutal Honesty Contract
-Before using this skill, establish explicit contract:
-```markdown
-"I'm going to give you unfiltered technical feedback. This will be direct,
-possibly harsh. The goal is clarity, not cruelty. I'll explain:
-1. What's wrong (specifically)
-2. Why it's wrong (technically)
-3. What correct looks like
-4. How to fix it
-If you want diplomatic feedback instead, let me know now."
-```
-**Get explicit consent before proceeding.**
----
+**Brutal honesty eliminates ambiguity but has costs.** Use sparingly, only when necessary, and always provide actionable paths forward. Attack the work, never the worker.
-**Created**: 2025-11-13
-**Category**: Quality Engineering / Code Review
-**Difficulty**: Advanced (requires judgment)
-**Use With Caution**: Can damage morale if misapplied
-**Best For**: Senior engineers, security issues, BS detection
+**The Brutal Honesty Contract**: Get explicit consent. "I'm going to give unfiltered technical feedback. This will be direct, possibly harsh. The goal is clarity, not cruelty."