agentic-qe 2.0.0 → 2.1.1

Files changed (144)
  1. package/.claude/agents/qx-partner.md +17 -4
  2. package/.claude/skills/accessibility-testing/SKILL.md +144 -692
  3. package/.claude/skills/agentic-quality-engineering/SKILL.md +176 -529
  4. package/.claude/skills/api-testing-patterns/SKILL.md +180 -560
  5. package/.claude/skills/brutal-honesty-review/SKILL.md +113 -603
  6. package/.claude/skills/bug-reporting-excellence/SKILL.md +116 -517
  7. package/.claude/skills/chaos-engineering-resilience/SKILL.md +127 -72
  8. package/.claude/skills/cicd-pipeline-qe-orchestrator/SKILL.md +209 -404
  9. package/.claude/skills/code-review-quality/SKILL.md +158 -608
  10. package/.claude/skills/compatibility-testing/SKILL.md +148 -38
  11. package/.claude/skills/compliance-testing/SKILL.md +132 -63
  12. package/.claude/skills/consultancy-practices/SKILL.md +114 -446
  13. package/.claude/skills/context-driven-testing/SKILL.md +117 -381
  14. package/.claude/skills/contract-testing/SKILL.md +176 -141
  15. package/.claude/skills/database-testing/SKILL.md +137 -130
  16. package/.claude/skills/exploratory-testing-advanced/SKILL.md +160 -629
  17. package/.claude/skills/holistic-testing-pact/SKILL.md +140 -188
  18. package/.claude/skills/localization-testing/SKILL.md +145 -33
  19. package/.claude/skills/mobile-testing/SKILL.md +132 -448
  20. package/.claude/skills/mutation-testing/SKILL.md +147 -41
  21. package/.claude/skills/performance-testing/SKILL.md +200 -546
  22. package/.claude/skills/quality-metrics/SKILL.md +164 -519
  23. package/.claude/skills/refactoring-patterns/SKILL.md +132 -699
  24. package/.claude/skills/regression-testing/SKILL.md +120 -926
  25. package/.claude/skills/risk-based-testing/SKILL.md +157 -660
  26. package/.claude/skills/security-testing/SKILL.md +199 -538
  27. package/.claude/skills/sherlock-review/SKILL.md +163 -699
  28. package/.claude/skills/shift-left-testing/SKILL.md +161 -465
  29. package/.claude/skills/shift-right-testing/SKILL.md +161 -519
  30. package/.claude/skills/six-thinking-hats/SKILL.md +175 -1110
  31. package/.claude/skills/skills-manifest.json +71 -20
  32. package/.claude/skills/tdd-london-chicago/SKILL.md +131 -448
  33. package/.claude/skills/technical-writing/SKILL.md +103 -154
  34. package/.claude/skills/test-automation-strategy/SKILL.md +166 -772
  35. package/.claude/skills/test-data-management/SKILL.md +126 -910
  36. package/.claude/skills/test-design-techniques/SKILL.md +179 -89
  37. package/.claude/skills/test-environment-management/SKILL.md +136 -91
  38. package/.claude/skills/test-reporting-analytics/SKILL.md +169 -92
  39. package/.claude/skills/testability-scoring/SKILL.md +172 -538
  40. package/.claude/skills/testability-scoring/scripts/generate-html-report.js +0 -0
  41. package/.claude/skills/visual-testing-advanced/SKILL.md +155 -78
  42. package/.claude/skills/xp-practices/SKILL.md +151 -587
  43. package/CHANGELOG.md +86 -0
  44. package/README.md +23 -16
  45. package/dist/agents/QXPartnerAgent.d.ts +47 -1
  46. package/dist/agents/QXPartnerAgent.d.ts.map +1 -1
  47. package/dist/agents/QXPartnerAgent.js +2086 -125
  48. package/dist/agents/QXPartnerAgent.js.map +1 -1
  49. package/dist/agents/lifecycle/AgentLifecycleManager.d.ts.map +1 -1
  50. package/dist/agents/lifecycle/AgentLifecycleManager.js +34 -31
  51. package/dist/agents/lifecycle/AgentLifecycleManager.js.map +1 -1
  52. package/dist/cli/commands/init-claude-md-template.d.ts.map +1 -1
  53. package/dist/cli/commands/init-claude-md-template.js +14 -0
  54. package/dist/cli/commands/init-claude-md-template.js.map +1 -1
  55. package/dist/core/SwarmCoordinator.d.ts +180 -0
  56. package/dist/core/SwarmCoordinator.d.ts.map +1 -0
  57. package/dist/core/SwarmCoordinator.js +473 -0
  58. package/dist/core/SwarmCoordinator.js.map +1 -0
  59. package/dist/core/memory/ReflexionMemoryAdapter.d.ts +109 -0
  60. package/dist/core/memory/ReflexionMemoryAdapter.d.ts.map +1 -0
  61. package/dist/core/memory/ReflexionMemoryAdapter.js +306 -0
  62. package/dist/core/memory/ReflexionMemoryAdapter.js.map +1 -0
  63. package/dist/core/memory/RuVectorPatternStore.d.ts +28 -0
  64. package/dist/core/memory/RuVectorPatternStore.d.ts.map +1 -1
  65. package/dist/core/memory/RuVectorPatternStore.js +70 -0
  66. package/dist/core/memory/RuVectorPatternStore.js.map +1 -1
  67. package/dist/core/memory/SparseVectorSearch.d.ts +55 -0
  68. package/dist/core/memory/SparseVectorSearch.d.ts.map +1 -0
  69. package/dist/core/memory/SparseVectorSearch.js +130 -0
  70. package/dist/core/memory/SparseVectorSearch.js.map +1 -0
  71. package/dist/core/memory/TieredCompression.d.ts +81 -0
  72. package/dist/core/memory/TieredCompression.d.ts.map +1 -0
  73. package/dist/core/memory/TieredCompression.js +270 -0
  74. package/dist/core/memory/TieredCompression.js.map +1 -0
  75. package/dist/core/memory/index.d.ts +6 -0
  76. package/dist/core/memory/index.d.ts.map +1 -1
  77. package/dist/core/memory/index.js +29 -1
  78. package/dist/core/memory/index.js.map +1 -1
  79. package/dist/core/metrics/MetricsAggregator.d.ts +228 -0
  80. package/dist/core/metrics/MetricsAggregator.d.ts.map +1 -0
  81. package/dist/core/metrics/MetricsAggregator.js +482 -0
  82. package/dist/core/metrics/MetricsAggregator.js.map +1 -0
  83. package/dist/core/metrics/index.d.ts +5 -0
  84. package/dist/core/metrics/index.d.ts.map +1 -0
  85. package/dist/core/metrics/index.js +11 -0
  86. package/dist/core/metrics/index.js.map +1 -0
  87. package/dist/core/optimization/SwarmOptimizer.d.ts +5 -0
  88. package/dist/core/optimization/SwarmOptimizer.d.ts.map +1 -1
  89. package/dist/core/optimization/SwarmOptimizer.js +17 -0
  90. package/dist/core/optimization/SwarmOptimizer.js.map +1 -1
  91. package/dist/core/orchestration/AdaptiveScheduler.d.ts +190 -0
  92. package/dist/core/orchestration/AdaptiveScheduler.d.ts.map +1 -0
  93. package/dist/core/orchestration/AdaptiveScheduler.js +460 -0
  94. package/dist/core/orchestration/AdaptiveScheduler.js.map +1 -0
  95. package/dist/core/orchestration/WorkflowOrchestrator.d.ts +13 -0
  96. package/dist/core/orchestration/WorkflowOrchestrator.d.ts.map +1 -1
  97. package/dist/core/orchestration/WorkflowOrchestrator.js +32 -0
  98. package/dist/core/orchestration/WorkflowOrchestrator.js.map +1 -1
  99. package/dist/core/recovery/CircuitBreaker.d.ts +176 -0
  100. package/dist/core/recovery/CircuitBreaker.d.ts.map +1 -0
  101. package/dist/core/recovery/CircuitBreaker.js +382 -0
  102. package/dist/core/recovery/CircuitBreaker.js.map +1 -0
  103. package/dist/core/recovery/RecoveryOrchestrator.d.ts +186 -0
  104. package/dist/core/recovery/RecoveryOrchestrator.d.ts.map +1 -0
  105. package/dist/core/recovery/RecoveryOrchestrator.js +476 -0
  106. package/dist/core/recovery/RecoveryOrchestrator.js.map +1 -0
  107. package/dist/core/recovery/RetryStrategy.d.ts +127 -0
  108. package/dist/core/recovery/RetryStrategy.d.ts.map +1 -0
  109. package/dist/core/recovery/RetryStrategy.js +314 -0
  110. package/dist/core/recovery/RetryStrategy.js.map +1 -0
  111. package/dist/core/recovery/index.d.ts +8 -0
  112. package/dist/core/recovery/index.d.ts.map +1 -0
  113. package/dist/core/recovery/index.js +27 -0
  114. package/dist/core/recovery/index.js.map +1 -0
  115. package/dist/core/skills/DependencyResolver.d.ts +99 -0
  116. package/dist/core/skills/DependencyResolver.d.ts.map +1 -0
  117. package/dist/core/skills/DependencyResolver.js +260 -0
  118. package/dist/core/skills/DependencyResolver.js.map +1 -0
  119. package/dist/core/skills/ManifestGenerator.d.ts +114 -0
  120. package/dist/core/skills/ManifestGenerator.d.ts.map +1 -0
  121. package/dist/core/skills/ManifestGenerator.js +449 -0
  122. package/dist/core/skills/ManifestGenerator.js.map +1 -0
  123. package/dist/core/skills/index.d.ts +9 -0
  124. package/dist/core/skills/index.d.ts.map +1 -0
  125. package/dist/core/skills/index.js +24 -0
  126. package/dist/core/skills/index.js.map +1 -0
  127. package/dist/mcp/handlers/chaos/chaos-inject-failure.d.ts +5 -0
  128. package/dist/mcp/handlers/chaos/chaos-inject-failure.d.ts.map +1 -1
  129. package/dist/mcp/handlers/chaos/chaos-inject-failure.js +36 -2
  130. package/dist/mcp/handlers/chaos/chaos-inject-failure.js.map +1 -1
  131. package/dist/mcp/handlers/chaos/chaos-inject-latency.d.ts +5 -0
  132. package/dist/mcp/handlers/chaos/chaos-inject-latency.d.ts.map +1 -1
  133. package/dist/mcp/handlers/chaos/chaos-inject-latency.js +36 -2
  134. package/dist/mcp/handlers/chaos/chaos-inject-latency.js.map +1 -1
  135. package/dist/mcp/server.d.ts +9 -9
  136. package/dist/mcp/server.d.ts.map +1 -1
  137. package/dist/mcp/server.js +1 -2
  138. package/dist/mcp/server.js.map +1 -1
  139. package/dist/types/qx.d.ts +113 -7
  140. package/dist/types/qx.d.ts.map +1 -1
  141. package/dist/types/qx.js.map +1 -1
  142. package/dist/visualization/api/RestEndpoints.js +1 -1
  143. package/dist/visualization/api/RestEndpoints.js.map +1 -1
  144. package/package.json +15 -54
@@ -1,153 +1,72 @@
1
1
  ---
2
- name: "Brutal Honesty Review"
2
+ name: brutal-honesty-review
3
3
  description: "Unvarnished technical criticism combining Linus Torvalds' precision, Gordon Ramsay's standards, and James Bach's BS-detection. Use when code/tests need harsh reality checks, certification schemes smell fishy, or technical decisions lack rigor. No sugar-coating, just surgical truth about what's broken and why."
4
+ category: quality-review
5
+ priority: high
6
+ tokenEstimate: 1200
7
+ agents: [qe-code-reviewer, qe-quality-gate, qe-security-auditor]
8
+ implementation_status: optimized
9
+ optimization_version: 1.0
10
+ last_optimized: 2025-12-03
11
+ dependencies: []
12
+ quick_reference_card: true
13
+ tags: [code-review, honesty, critical-thinking, technical-criticism, quality]
4
14
  ---
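The frontmatter added in 2.1.1 turns the skill header into structured metadata. A minimal sketch of how such a header might be typed and sanity-checked is shown here. The `SkillFrontmatter` interface and `validateFrontmatter` function are hypothetical names, not part of agentic-qe, and the kebab-case rule is an assumption inferred from the `name` change in this hunk.

```typescript
// Hypothetical shape inferred from the fields added in this diff.
interface SkillFrontmatter {
  name: string;
  description: string;
  category: string;
  priority: string;
  tokenEstimate: number;
  agents: string[];
  dependencies: string[];
  tags: string[];
}

// Returns a list of problems; an empty list means the metadata looks sane.
// Both checks are assumptions for illustration, not documented rules.
function validateFrontmatter(fm: SkillFrontmatter): string[] {
  const problems: string[] = [];
  if (!/^[a-z0-9]+(-[a-z0-9]+)*$/.test(fm.name)) {
    problems.push(`name "${fm.name}" is not kebab-case`);
  }
  if (!Number.isInteger(fm.tokenEstimate) || fm.tokenEstimate <= 0) {
    problems.push("tokenEstimate must be a positive integer");
  }
  return problems;
}

// Values copied from the added frontmatter lines (description shortened).
const brutalHonesty: SkillFrontmatter = {
  name: "brutal-honesty-review",
  description: "Unvarnished technical criticism",
  category: "quality-review",
  priority: "high",
  tokenEstimate: 1200,
  agents: ["qe-code-reviewer", "qe-quality-gate", "qe-security-auditor"],
  dependencies: [],
  tags: ["code-review", "honesty", "critical-thinking", "technical-criticism", "quality"],
};
```

Under this sketch the 2.0.0 value `"Brutal Honesty Review"` would be flagged and the 2.1.1 value would pass.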
5
15
 
6
16
  # Brutal Honesty Review
7
17
 
8
- ## What This Skill Does
18
+ <default_to_action>
19
+ When brutal honesty is needed:
20
+ 1. CHOOSE MODE: Linus (technical), Ramsay (standards), Bach (BS detection)
21
+ 2. VERIFY CONTEXT: Senior engineer? Repeated mistake? Critical bug? Explicit request?
22
+ 3. STRUCTURE: What's broken → Why it's wrong → What correct looks like → How to fix
23
+ 4. ATTACK THE WORK, not the worker
24
+ 5. ALWAYS provide actionable path forward
9
25
 
10
- Delivers brutally honest technical criticism across three modes, each calibrated to eliminate different types of mediocrity:
26
+ **Quick Mode Selection:**
27
+ - **Linus**: Code is technically wrong, inefficient, misunderstands fundamentals
28
+ - **Ramsay**: Quality is subpar compared to clear excellence model
29
+ - **Bach**: Certifications, best practices, or vendor hype need reality check
11
30
 
12
- 1. **Linus Mode** - Surgical technical precision on code/architecture
13
- 2. **Ramsay Mode** - Standards-driven quality assessment
14
- 3. **Bach Mode** - BS detection in testing practices/certifications
31
+ **Calibration:**
32
+ - Level 1 (Direct): "This approach is fundamentally flawed because..."
33
+ - Level 2 (Harsh): "We've discussed this three times. Why is it back?"
34
+ - Level 3 (Brutal): "This is negligent. You're exposing user data because..."
15
35
 
16
- Unlike diplomatic reviews, this skill **dissects why something is wrong, explains the correct approach, and has zero patience for repeated mistakes or sloppy thinking**.
36
+ **DO NOT USE FOR:** Junior devs' first PRs, demoralized teams, public forums, low psychological safety
37
+ </default_to_action>
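The mode selection, calibration levels, and "DO NOT USE FOR" list above amount to a small decision procedure. It could be sketched as follows, assuming hypothetical names (`ReviewContext`, `isAppropriate`, `chooseMode`, `chooseCalibration`) that do not appear in the package. The mapping of levels to situations follows the 2.0.0 wording (direct for experienced engineers, harsh for repeated mistakes, brutal for critical issues).

```typescript
type Mode = "linus" | "ramsay" | "bach";
type Calibration = "direct" | "harsh" | "brutal";

// Hypothetical description of the review situation.
interface ReviewContext {
  problem: "technically-wrong" | "below-standard" | "hype-or-certification";
  repeatedMistake: boolean;
  criticalIssue: boolean; // e.g. a security vulnerability
  juniorFirstPr: boolean;
  demoralizedTeam: boolean;
  publicForum: boolean;
  lowPsychologicalSafety: boolean;
}

// Mirrors the "DO NOT USE FOR" list: any one of these rules the skill out.
function isAppropriate(ctx: ReviewContext): boolean {
  return !(
    ctx.juniorFirstPr ||
    ctx.demoralizedTeam ||
    ctx.publicForum ||
    ctx.lowPsychologicalSafety
  );
}

// Mirrors "Quick Mode Selection".
function chooseMode(ctx: ReviewContext): Mode {
  switch (ctx.problem) {
    case "technically-wrong":
      return "linus";
    case "below-standard":
      return "ramsay";
    case "hype-or-certification":
      return "bach";
  }
}

// Mirrors "Calibration": escalate only when the situation warrants it.
function chooseCalibration(ctx: ReviewContext): Calibration {
  if (ctx.criticalIssue) return "brutal";
  if (ctx.repeatedMistake) return "harsh";
  return "direct";
}
```

Checking `isAppropriate` first keeps the guard rails ahead of the choice of tone, which matches the order of steps 1 and 2 in the block above only loosely. Treat the ordering as a design choice of this sketch.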
17
38
 
18
- ---
19
-
20
- ## Prerequisites
21
-
22
- - **Thick skin** - This will hurt
23
- - **Willingness to learn** - Criticism is actionable, not performative
24
- - **Context awareness** - Know when harsh honesty helps vs. harms team morale
25
-
26
- ---
27
-
28
- ## When to Use This Skill
29
-
30
- ### ✅ APPROPRIATE CONTEXTS:
31
- - Senior engineers who want unfiltered technical review
32
- - Teaching moments for patterns that keep recurring
33
- - Evaluating vendor claims, certification schemes, or industry hype
34
- - Code that's been "good enough" for too long
35
- - Teams explicitly asking for no-BS feedback
36
- - Security vulnerabilities or critical bugs
37
- - Technical debt requiring executive attention
38
-
39
- ### ❌ INAPPROPRIATE CONTEXTS:
40
- - Junior developers' first contributions
41
- - Already-demoralized teams
42
- - Public forums (brutal honesty ≠ public humiliation)
43
- - When psychological safety is low
44
- - Performance reviews (unless specifically requested)
45
-
46
- ---
47
-
48
- ## Quick Start
49
-
50
- ### Mode 1: Linus (Technical Precision)
51
-
52
- **When**: Code is technically wrong, inefficient, or demonstrates misunderstanding
39
+ ## Quick Reference Card
53
40
 
54
- ```bash
55
- # Usage
56
- Review this pull request with Linus-level precision:
57
- [paste code/PR link]
41
+ ### When to Use
42
+ | Context | Appropriate? | Why |
43
+ |---------|-------------|-----|
44
+ | Senior engineer code review | ✅ Yes | Can handle directness, respects precision |
45
+ | Repeated architectural mistakes | ✅ Yes | Gentle approaches failed |
46
+ | Security vulnerabilities | ✅ Yes | Stakes too high for sugar-coating |
47
+ | Evaluating vendor claims | ✅ Yes | BS detection prevents expensive mistakes |
48
+ | Junior dev's first PR | ❌ No | Use constructive mentoring |
49
+ | Demoralized team | ❌ No | Will break, not motivate |
50
+ | Public forum | ❌ No | Public humiliation destroys trust |
58
51
 
59
- # Output style
60
- "This is completely broken. You're holding the lock for the entire
61
- I/O operation, which means every thread will serialize on this mutex.
62
- Did you even test this under load? The correct approach is..."
63
- ```
52
+ ### Three Modes
64
53
 
65
- **Characteristics**:
66
- - ❌ Eliminates ambiguity about technical standards
67
- - Explains WHY it's wrong, not just THAT it's wrong
68
- - Zero tolerance for repeated architectural mistakes
69
- - 🎯 Focuses on correctness, performance, maintainability
54
+ | Mode | When | Example Output |
55
+ |------|------|----------------|
56
+ | **Linus** | Code technically wrong | "You're holding the lock for the entire I/O. Did you test under load?" |
57
+ | **Ramsay** | Quality below standards | "12 tests and 10 just check variables exist. Where's the business logic?" |
58
+ | **Bach** | BS detection needed | "This cert tests memorization, not bug-finding. Who actually benefits?" |
70
59
 
71
60
  ---
72
61
 
73
- ### Mode 2: Ramsay (Standards-Driven)
74
-
75
- **When**: Quality is subpar compared to clear excellence model
76
-
77
- ```bash
78
- # Usage
79
- Assess this test suite against production standards:
80
- [paste test code]
81
-
82
- # Output style
83
- "Look at this! You've got 12 tests and 10 of them are just checking
84
- if variables exist. Where's the business logic coverage? Where's the
85
- edge cases? This is RAW. You wouldn't serve this in production, so
86
- why are you trying to merge it?"
87
- ```
88
-
89
- **Characteristics**:
90
- - 🔥 Compares reality against clear mental model of excellence
91
- - 📊 Uses concrete metrics (coverage, complexity, duplication)
92
- - 🎓 Teaching through high standards, not just criticism
93
- - 💎 Doesn't just tear down - shows what good looks like
94
-
95
- ---
96
-
97
- ### Mode 3: Bach (BS Detection)
98
-
99
- **When**: Certifications, best practices, or vendor hype need reality check
100
-
101
- ```bash
102
- # Usage
103
- Evaluate this testing certification/practice/tool claim:
104
- [paste claim or approach]
105
-
106
- # Output style
107
- "This certification teaches you to follow scripts, not to think.
108
- Real testing requires context-driven decisions, not checkbox compliance.
109
- Does this cert help testers find bugs faster? No. Does it help them
110
- advocate for quality? No. It helps the certification body make money."
111
- ```
112
-
113
- **Characteristics**:
114
- - 🚨 Calls out cargo cult practices
115
- - 🔍 Questions: "Does this actually help?"
116
- - 📉 Exposes when tools/processes exist for vendor profit, not user benefit
117
- - 🧠 Promotes critical thinking over certification theater
118
-
119
- ---
120
-
121
- ## Step-by-Step Guide
122
-
123
- ### Step 1: Choose Your Mode
124
-
125
- ```markdown
126
- **Linus Mode**: Code/architecture review requiring technical precision
127
- **Ramsay Mode**: Quality assessment against known standards
128
- **Bach Mode**: Evaluating practices, certifications, industry claims
129
- ```
130
-
131
- ### Step 2: Establish Context
132
-
133
- ```markdown
134
- Before delivering brutal honesty, verify:
135
- 1. **Audience maturity** - Can they handle direct criticism?
136
- 2. **Relationship capital** - Have you earned the right to be harsh?
137
- 3. **Actionability** - Can the recipient actually fix this?
138
- 4. **Intent** - Is this helping them improve or just venting?
139
- ```
140
-
141
- ### Step 3: Deliver Structured Criticism
142
-
143
- Each mode follows this template:
62
+ ## The Criticism Structure
144
63
 
145
64
  ```markdown
146
65
  ## What's Broken
147
- [Surgical description of the problem]
66
+ [Surgical description - specific, technical]
148
67
 
149
68
  ## Why It's Wrong
150
- [Technical/logical explanation, not opinion]
69
+ [Technical explanation, not opinion]
151
70
 
152
71
  ## What Correct Looks Like
153
72
  [Clear model of excellence]
@@ -159,47 +78,15 @@ Each mode follows this template:
159
78
  [Impact if not fixed]
160
79
  ```
161
80
 
162
- ### Step 4: Calibrate Harshness
163
-
164
- ```markdown
165
- **Level 1 - Direct** (for experienced engineers):
166
- "This approach is fundamentally flawed because..."
167
-
168
- **Level 2 - Harsh** (for repeated mistakes):
169
- "We've discussed this pattern three times. Why is it back?"
170
-
171
- **Level 3 - Brutal** (for critical issues or willful ignorance):
172
- "This is negligent. You're exposing user data because..."
173
- ```
174
-
175
81
  ---
176
82
 
177
- ## Mode Details
83
+ ## Mode Examples
178
84
 
179
85
  ### Linus Mode: Technical Precision
180
86
 
181
- #### Code Review Pattern
182
-
183
- ```markdown
184
- ### 1. Identify Fundamental Flaw
185
- "You're doing [X], which demonstrates misunderstanding of [concept]."
186
-
187
- ### 2. Explain Why It's Wrong
188
- "This breaks when [scenario] because [technical reason]."
189
-
190
- ### 3. Show Correct Approach
191
- "The correct pattern is [Y] because [reasoning]."
192
-
193
- ### 4. Demand Better
194
- "This should never have passed local testing. Did you run it?"
195
- ```
196
-
197
- #### Example: Concurrency Bug
198
-
199
87
  ```markdown
200
88
  **Problem**: Holding database connection during HTTP call
201
89
 
202
- **Linus Analysis**:
203
90
  "This is completely broken. You're holding a database connection
204
91
  open while waiting for an external HTTP request. Under load, you'll
205
92
  exhaust the connection pool in seconds.
@@ -215,511 +102,134 @@ The correct approach is:
215
102
  This is Connection Management 101. Why wasn't this caught in review?"
216
103
  ```
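The Linus example criticizes holding a database connection while waiting on an external HTTP request. The hunk boundary hides the skill's own list of fixes, so the following is only one common remedy, sketched under stated assumptions: `FakePool`, `slowHttpCall`, and `processOrder` are hypothetical stand-ins, not agentic-qe APIs. Each database step borrows a connection briefly, and the slow call runs with none held.

```typescript
// Hypothetical in-memory pool, used only to make the sketch runnable.
class FakePool {
  private inUse = 0;
  private readonly size: number;

  constructor(size: number) {
    this.size = size;
  }

  // Borrow a connection for the duration of `work`, then always return it.
  async withConnection<T>(work: () => Promise<T>): Promise<T> {
    if (this.inUse >= this.size) throw new Error("connection pool exhausted");
    this.inUse++;
    try {
      return await work();
    } finally {
      this.inUse--;
    }
  }

  get active(): number {
    return this.inUse;
  }
}

// Stand-in for a slow external HTTP request; reports how many
// connections were held while it was "in flight".
async function slowHttpCall(pool: FakePool): Promise<{ status: string; connectionsHeld: number }> {
  await new Promise<void>((resolve) => setTimeout(resolve, 10));
  return { status: "ok", connectionsHeld: pool.active };
}

async function processOrder(pool: FakePool, orderId: number) {
  // 1. Short read: the connection is released when the callback resolves.
  const order = await pool.withConnection(async () => ({ id: orderId, total: 42 }));

  // 2. External call with no connection held.
  const response = await slowHttpCall(pool);

  // 3. Short write on a freshly borrowed connection.
  await pool.withConnection(async () => {
    // A real implementation would persist response.status here.
  });

  return { order, response };
}
```

The trade-off is that the read and the write no longer share one transaction, so a real implementation would need to tolerate the record changing in between.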
217
104
 
218
- #### Example: Premature Optimization
219
-
220
- ```markdown
221
- **Problem**: Complex caching for operation that runs once per day
222
-
223
- **Linus Analysis**:
224
- "You've added 200 lines of caching logic with Redis, LRU eviction,
225
- and TTL management for a report that generates once daily.
226
-
227
- This is premature optimization. The cure is worse than the disease.
228
-
229
- Measure first. Your 'optimization' added:
230
- - 3 new failure modes (Redis down, cache corruption, TTL bugs)
231
- - 10x complexity
232
- - Zero measurable benefit
233
-
234
- Remove it. If profiling later shows this endpoint is slow, then optimize."
235
- ```
236
-
237
- ---
238
-
239
105
  ### Ramsay Mode: Standards-Driven Quality
240
106
 
241
- #### Test Quality Assessment
242
-
243
- ```markdown
244
- ### 1. Compare to Excellence Model
245
- "Good tests should [criteria]. This test suite [fails at criteria]."
246
-
247
- ### 2. Use Concrete Metrics
248
- "You have 12% branch coverage. Production-ready is 80%+."
249
-
250
- ### 3. Show Gap
251
- "Look at this edge case. It's obvious. Why isn't it tested?"
252
-
253
- ### 4. Demand Excellence
254
- "You know what good looks like. Why didn't you deliver it?"
255
- ```
256
-
257
- #### Example: Weak Test Suite
258
-
259
107
  ```markdown
260
108
  **Problem**: Tests only verify happy path
261
109
 
262
- **Ramsay Analysis**:
263
- "Look at this test suite. You've got 15 tests, and 14 of them are
264
- just happy path scenarios. Where's the validation testing? Where are
265
- the edge cases? Where's the failure mode testing?
110
+ "Look at this test suite. 15 tests, 14 happy path scenarios.
111
+ Where's the validation testing? Edge cases? Failure modes?
266
112
 
267
- This is RAW. You're testing if the code runs, not if it's correct.
113
+ This is RAW. You're testing if code runs, not if it's correct.
268
114
 
269
- A production-ready test suite covers:
115
+ Production-ready covers:
270
116
  ✓ Happy path (you have this)
271
117
  ✗ Validation failures (missing)
272
118
  ✗ Boundary conditions (missing)
273
119
  ✗ Error handling (missing)
274
120
  ✗ Concurrent access (missing)
275
- ✗ Resource exhaustion (missing)
276
-
277
- You wouldn't ship code with 12% coverage. Don't merge tests with
278
- 12% scenario coverage."
279
- ```
280
-
281
- #### Example: Flaky Tests
282
-
283
- ```markdown
284
- **Problem**: Test suite has intermittent failures
285
-
286
- **Ramsay Analysis**:
287
- "These tests are FLAKY. Every third run fails because you're using
288
- setTimeout() and hoping things complete in time. That's not testing,
289
- that's gambling.
290
-
291
- Flaky tests train developers to ignore failures. That's worse than
292
- no tests.
293
-
294
- Fix this NOW:
295
- 1. Remove all setTimeout() - use proper async/await
296
- 2. Mock external dependencies - don't test network reliability
297
- 3. Make tests deterministic - same input = same output
298
- 4. If it can't be made stable, DELETE IT
299
121
 
300
- A flaky test is a broken test. Don't merge broken code."
122
+ You wouldn't ship code with 12% coverage. Don't merge tests
123
+ with 12% scenario coverage."
301
124
  ```
302
125
 
303
- ---
304
-
305
126
  ### Bach Mode: BS Detection
306
127
 
307
- #### Certification Evaluation
308
-
309
128
  ```markdown
310
- ### 1. Question Core Value
311
- "Does this [practice/cert/tool] help testers do better work?"
129
+ **Problem**: ISTQB certification required for QE roles
312
130
 
313
- ### 2. Identify Real Beneficiary
314
- "Who profits from this? Vendor, consultant, or actual testers?"
315
-
316
- ### 3. Expose Cargo Cult Thinking
317
- "This promotes [checkbox compliance] over [critical thinking]."
318
-
319
- ### 4. Advocate for Context-Driven Approach
320
- "Real testing requires [context-specific decisions], not [universal rules]."
321
- ```
322
-
323
- #### Example: Test Automation Tool Hype
324
-
325
- ```markdown
326
- **Problem**: Vendor claims "AI-powered testing eliminates manual testing"
327
-
328
- **Bach Analysis**:
329
- "This is vendor hype, not reality. They're selling you magic beans.
330
-
331
- 'AI-powered' means what exactly? Machine learning model that auto-generates
332
- tests? Fine. But:
333
-
334
- 1. **Who defines test oracles?** (AI can't know business rules)
335
- 2. **Who handles edge cases?** (AI trained on common paths)
336
- 3. **Who questions requirements?** (AI doesn't understand context)
337
- 4. **Who investigates failures?** (AI can't do root cause analysis)
338
-
339
- This tool might auto-generate SOME tests (usually shallow). It won't
340
- replace thinking testers.
341
-
342
- The real question: Does this tool help YOUR testers on YOUR product
343
- in YOUR context? If you can't answer that specifically, you're buying
344
- hype, not tools."
345
- ```
346
-
347
- #### Example: Certification Theater
348
-
349
- ```markdown
350
- **Problem**: Company requires ISTQB certification for QE roles
351
-
352
- **Bach Analysis**:
353
131
  "ISTQB tests if you memorized terminology, not if you can test software.
354
132
 
355
133
  Real testing skills:
356
134
  - Finding bugs others miss
357
- - Designing effective test strategies for context
135
+ - Designing effective strategies for context
358
136
  - Communicating risk to stakeholders
359
- - Questioning requirements and assumptions
360
- - Advocating for quality
361
137
 
362
138
  ISTQB tests:
363
- - Definitions of 'alpha testing' vs 'beta testing'
364
- - Names of test design techniques you'll never use
365
- - V-model vs Agile terminology
366
- - Checkbox thinking
367
-
368
- If ISTQB helped testers get better, companies with ISTQB-certified
369
- teams would ship higher quality. They don't.
139
+ - Definitions of 'alpha' vs 'beta' testing
140
+ - Names of techniques you'll never use
141
+ - V-model terminology
370
142
 
371
- Want better testers? Hire curious people, give them context, let them
372
- explore, teach them technical skills. Certification is optional.
373
- Thinking is mandatory."
143
+ If ISTQB helped testers, companies with certified teams would ship
144
+ higher quality. They don't."
374
145
  ```
375
146
 
376
147
  ---
377
148
 
378
149
  ## Assessment Rubrics
379
150
 
380
- ### Code Quality Rubric (Linus Mode)
151
+ ### Code Quality (Linus Mode)
381
152
 
382
- ```markdown
383
153
  | Criteria | Failing | Passing | Excellent |
384
154
  |----------|---------|---------|-----------|
385
- | **Correctness** | Wrong algorithm/logic | Works in tested cases | Proven correct across edge cases |
386
- | **Performance** | Naive O(n²) where O(n) exists | Acceptable complexity | Optimal algorithm + profiled |
387
- | **Error Handling** | Crashes on invalid input | Returns error codes | Graceful degradation + logging |
388
- | **Concurrency** | Race conditions present | Thread-safe with locks | Lock-free or proven safe |
389
- | **Testability** | Impossible to unit test | Can be tested with mocks | Self-testing design |
390
- | **Maintainability** | "Clever" code | Clear intent | Self-documenting + simple |
391
-
392
- **Passing Threshold**: Minimum "Passing" on all criteria
393
- **Ship-Ready**: Minimum "Excellent" on Correctness, Performance, Error Handling
394
- ```
155
+ | Correctness | Wrong algorithm | Works in tested cases | Proven across edge cases |
156
+ | Performance | Naive O(n²) | Acceptable complexity | Optimal + profiled |
157
+ | Error Handling | Crashes on invalid | Returns error codes | Graceful degradation |
158
+ | Testability | Impossible to test | Can mock | Self-testing design |
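The 2.0.0 text removed in this hunk paired the rubric with two thresholds: "Passing" requires at least Passing on every criterion, and "Ship-Ready" additionally requires Excellent on Correctness, Performance, and Error Handling. A minimal sketch of that rule follows, with hypothetical names (`Scorecard`, `meetsPassingThreshold`, `isShipReady`) and the four criteria kept in the 2.1.1 table.

```typescript
type Grade = "failing" | "passing" | "excellent";
type Criterion = "correctness" | "performance" | "errorHandling" | "testability";
type Scorecard = Record<Criterion, Grade>;

// Criteria that must be "excellent" before shipping, per the removed 2.0.0 text.
const SHIP_CRITICAL: Criterion[] = ["correctness", "performance", "errorHandling"];

// Passing threshold: no criterion may be graded "failing".
function meetsPassingThreshold(card: Scorecard): boolean {
  return (Object.keys(card) as Criterion[]).every((c) => card[c] !== "failing");
}

// Ship-ready: passes overall and is excellent on every ship-critical criterion.
function isShipReady(card: Scorecard): boolean {
  return meetsPassingThreshold(card) && SHIP_CRITICAL.every((c) => card[c] === "excellent");
}
```

Whether 2.1.1 still intends these thresholds is not stated in the shortened rubric, so the sketch should be read as documenting the earlier behavior.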
395
159
 
396
- ### Test Quality Rubric (Ramsay Mode)
160
+ ### Test Quality (Ramsay Mode)
397
161
 
398
- ```markdown
399
162
  | Criteria | Raw | Acceptable | Michelin Star |
400
163
  |----------|-----|------------|---------------|
401
- | **Coverage** | <50% branch | 80%+ branch | 95%+ branch + mutation tested |
402
- | **Edge Cases** | Only happy path | Common failures | Boundary analysis complete |
403
- | **Clarity** | What is this testing? | Clear test names | Self-documenting test pyramid |
404
- | **Speed** | Minutes to run | <10s for unit tests | <1s, parallelized |
405
- | **Stability** | Flaky (>1% failure) | Stable but slow | Deterministic + fast |
406
- | **Isolation** | Tests depend on each other | Independent tests | Pure functions, no shared state |
407
-
408
- **Merge Threshold**: Minimum "Acceptable" on all criteria
409
- **Production-Ready**: Minimum "Michelin Star" on Coverage, Stability, Isolation
410
- ```
164
+ | Coverage | <50% branch | 80%+ branch | 95%+ mutation tested |
165
+ | Edge Cases | Only happy path | Common failures | Boundary analysis complete |
166
+ | Stability | Flaky (>1% failure) | Stable but slow | Deterministic + fast |
411
167
 
412
- ### BS Detection Rubric (Bach Mode)
168
+ ### BS Detection (Bach Mode)
413
169
 
414
- ```markdown
415
170
  | Red Flag | Evidence | Impact |
416
171
  |----------|----------|--------|
417
- | **Cargo Cult Practice** | "Best practice" with no context | Wasted effort, false confidence |
418
- | **Certification Theater** | Required cert unrelated to skills | Filters out critical thinkers |
419
- | **Vendor Lock-In** | Tool solves problem it created | Expensive dependency |
420
- | **False Automation** | "AI" still needs human verification | Automation debt |
421
- | **Checkbox Quality** | Compliance without outcomes | Audit passes, customers suffer |
422
- | **Hype Cycle** | Promises 10x improvement | Budget waste, disillusionment |
423
-
424
- **Green Flag Test**: "Does this help testers/developers do better work in THIS context?"
425
- ```
172
+ | Cargo Cult Practice | "Best practice" with no context | Wasted effort |
173
+ | Certification Theater | Required cert unrelated to skills | Filters out thinkers |
174
+ | Vendor Lock-In | Tool solves problem it created | Expensive dependency |
426
175
 
427
176
  ---
428
177
 
429
- ## Calibration Guide
430
-
431
- ### When Brutal Honesty Works
432
-
433
- ```markdown
434
- ✅ **Senior engineer with ego but skills**
435
- → They can handle directness and will respect precision
436
-
437
- ✅ **Repeated architectural mistakes**
438
- → Gentle approaches failed; escalation needed
439
-
440
- ✅ **Critical bug in production code**
441
- → Stakes are high; no time for sugar-coating
442
-
443
- ✅ **Evaluating vendor claims before purchase**
444
- → BS detection prevents expensive mistakes
445
-
446
- ✅ **Team explicitly requests no-BS feedback**
447
- → They've given permission for harshness
448
- ```
449
-
450
- ### When to Dial It Back
451
-
452
- ```markdown
453
- ❌ **Junior developer's first PR**
454
- → Use constructive mentoring instead
455
-
456
- ❌ **Team is already demoralized**
457
- → Harsh criticism will break, not motivate
178
+ ## Agent Integration
458
179
 
459
- ❌ **Public forum or team meeting**
460
- → Public humiliation destroys trust
180
+ ```typescript
181
+ // Brutal honesty code review
182
+ await Task("Code Review", {
183
+ code: pullRequestDiff,
184
+ mode: 'linus', // or 'ramsay', 'bach'
185
+ calibration: 'direct', // or 'harsh', 'brutal'
186
+ requireActionable: true
187
+ }, "qe-code-reviewer");
461
188
 
462
- **Unclear if recipient can fix it**
463
- → Frustration without actionability is cruel
464
-
465
- **Personal attack vs. technical criticism**
466
- → Never: "You're stupid"
467
- Always: "This approach is flawed because..."
189
+ // BS detection for vendor claims
190
+ await Task("Vendor Evaluation", {
191
+ claims: vendorMarketingClaims,
192
+ mode: 'bach',
193
+ requireEvidence: true
194
+ }, "qe-quality-gate");
468
195
  ```
469
196
 
470
197
  ---
471
198
 
472
- ## Examples from History
473
-
474
- ### Linus Torvalds: Technical Precision
475
-
476
- > **Original Email** (kernel mailing list):
477
- > "Christ, people. Learn to use git rebase. This merge mess is unreadable.
478
- > I'm not pulling this garbage until you clean up the history. And don't
479
- > give me that 'git is hard' excuse - it's your job to know your tools."
480
-
481
- **Why It Worked**:
482
- - ✅ Clear technical standard (clean git history)
483
- - ✅ Actionable fix (use rebase)
484
- - ✅ Audience was experienced kernel developers
485
- - ✅ Pattern had been explained before
486
-
487
- **When It Backfired**:
488
- - ❌ Created hostile environment for newcomers
489
- - ❌ Scared away potential contributors
490
- - ❌ Linus later acknowledged cost to community
491
-
492
- **Lesson**: Technical precision without empathy scales poorly.
493
-
494
- ---
495
-
496
- ### Gordon Ramsay: Standards-Driven Excellence
199
+ ## Agent Coordination Hints
497
200
 
498
- > **Kitchen Nightmares**:
499
- > "You've served me frozen ravioli from a bag and tried to pass it off as
500
- > fresh pasta. Do you think I'm an idiot? Your customers aren't idiots either.
501
- > You know what fresh pasta tastes like - why are you serving this?"
502
-
503
- **Why It Worked**:
504
- - ✅ Clear standard (fresh pasta vs. frozen)
505
- - ✅ Owner had expertise (was trained chef)
506
- - ✅ Impact was clear (losing customers)
507
- - ✅ Ramsay showed what excellence looked like (cooked fresh pasta)
508
-
509
- **Structure**:
510
- 1. Identify gap between current and excellent
511
- 2. Question why gap exists (laziness, cost-cutting, ignorance)
512
- 3. Demonstrate excellence
513
- 4. Demand recipient meet the standard they already know
514
-
515
- ---
516
-
517
- ### James Bach: BS Detection in Testing
518
-
519
- > **Blog Post on Test Automation**:
520
- > "When a vendor tells you their tool 'automates testing,' ask them to define
521
- > 'testing.' They usually mean 'running checks' - verifying known conditions.
522
- > Actual testing requires thinking, questioning, exploring. That can't be
523
- > automated. What they're selling is useful, but it's not testing. Don't
524
- > let marketing confuse you."
525
-
526
- **Why It Works**:
527
- - ✅ Clarifies terminology confusion
528
- - ✅ Exposes economic incentives (vendor profit)
529
- - ✅ Empowers testers to think critically
530
- - ✅ Doesn't attack tool, attacks misleading claims
531
-
532
- **Structure**:
533
- 1. Identify the BS claim
534
- 2. Explain why it's misleading
535
- 3. Clarify what's actually true
536
- 4. Advocate for context-driven thinking
537
-
538
- ---
539
-
540
- ## Advanced Patterns
541
-
542
- ### Pattern 1: The Technical Breakdown
543
-
544
- **When**: Code demonstrates fundamental misunderstanding
545
-
546
- ```markdown
547
- **Step 1**: Identify the core misunderstanding
548
- "You're treating this like a single-threaded problem, but it's not."
549
-
550
- **Step 2**: Explain the fundamental concept
551
- "In concurrent systems, shared mutable state requires synchronization."
552
-
553
- **Step 3**: Show where it breaks
554
- "When thread A reads X=5, thread B might write X=10 before A completes."
555
-
556
- **Step 4**: Demand better
557
- "This is concurrency 101. Why wasn't this caught in review?"
201
+ ### Memory Namespace
558
202
  ```
559
-
560
- ### Pattern 2: The Standards Gap
561
-
562
- **When**: Quality is measurably below known standards
563
-
564
- ```markdown
565
- **Step 1**: Establish the standard
566
- "Production-ready code has 80%+ branch coverage."
567
-
568
- **Step 2**: Measure the gap
569
- "This has 35% coverage."
570
-
571
- **Step 3**: Show what's missing
572
- "You're not testing error paths, edge cases, or validation."
573
-
574
- **Step 4**: Demand excellence
575
- "You know what good looks like. Deliver it."
203
+ aqe/brutal-honesty/
204
+ ├── code-reviews/* - Technical review findings
205
+ ├── bs-detection/* - Vendor/cert evaluations
206
+ └── calibration/* - Context-appropriate levels
576
207
  ```
577
208
 
578
- ### Pattern 3: The BS Detector
579
-
580
- **When**: Claims don't match reality
581
-
582
- ```markdown
583
- **Step 1**: State the claim
584
- "This certification proves testing competency."
585
-
586
- **Step 2**: Question it
587
- "Does it? What does it actually test?"
588
-
589
- **Step 3**: Expose the gap
590
- "It tests memorization, not bug-finding ability."
591
-
592
- **Step 4**: Advocate for reality
593
- "Want better testers? Measure outcomes, not credentials."
594
- ```
595
-
596
- ---
597
-
598
- ## Troubleshooting
599
-
600
- ### Issue: Feedback Feels Personal
601
-
602
- **Symptoms**: Recipient becomes defensive or emotional
603
-
604
- **Cause**: Criticism targeted person instead of work
605
-
606
- **Solution**:
607
- ```markdown
608
- ❌ "You're not thinking about edge cases"
609
- ✅ "This code doesn't handle edge cases because..."
610
-
611
- ❌ "You always write flaky tests"
612
- ✅ "These tests are flaky because they depend on timing"
613
-
614
- **Key**: Attack the work, not the worker.
615
- ```
616
-
617
- ### Issue: Feedback Isn't Actionable
618
-
619
- **Symptoms**: "This sucks" without explanation
620
-
621
- **Cause**: Missing the "why" and "how to fix"
622
-
623
- **Solution**:
624
- ```markdown
625
- ❌ "This code is terrible"
626
- ✅ "This code is inefficient because [reason]. Fix by [approach]."
627
-
628
- **Structure**:
629
- 1. What's wrong (specific)
630
- 2. Why it's wrong (technical reason)
631
- 3. What correct looks like (model)
632
- 4. How to fix it (actionable)
633
- ```
634
-
635
- ### Issue: Calibration is Wrong
636
-
637
- **Symptoms**: Harsh feedback in wrong context (junior dev, demoralized team)
638
-
639
- **Cause**: Forgot to check audience/context
640
-
641
- **Solution**:
642
- ```markdown
643
- **Before brutal honesty, verify:**
644
- 1. Recipient has skills to fix it
645
- 2. Relationship capital exists
646
- 3. Context allows for directness
647
- 4. Psychological safety is high
648
-
649
- **If any are false, dial back to constructive.**
209
+ ### Fleet Coordination
210
+ ```typescript
211
+ const reviewFleet = await FleetManager.coordinate({
212
+   strategy: 'brutal-review',
213
+   agents: [
214
+     'qe-code-reviewer', // Technical precision
215
+     'qe-security-auditor', // Security brutality
216
+     'qe-quality-gate' // Standards enforcement
217
+   ],
218
+   topology: 'parallel'
219
+ });
650
220
  ```
651
221
 
652
222
  ---
653
223
 
654
224
  ## Related Skills
655
-
656
- - **[Code Review Quality](../code-review-quality/)** - Diplomatic version of code review
657
- - **[Context-Driven Testing](../context-driven-testing/)** - Foundation for Bach-mode BS detection
658
- - **[TDD Red-Green-Refactor](../tdd-london-chicago/)** - Systematic quality approach
659
- - **[Exploratory Testing](../exploratory-testing-advanced/)** - Critical thinking in testing
225
+ - [code-review-quality](../code-review-quality/) - Diplomatic version
226
+ - [context-driven-testing](../context-driven-testing/) - Foundation for Bach mode
227
+ - [sherlock-review](../sherlock-review/) - Evidence-based investigation
660
228
 
661
229
  ---
662
230
 
663
- ## Philosophy
664
-
665
- ### Why Brutal Honesty Has a Place
666
-
667
- **1. Eliminates Ambiguity**
668
- - Diplomatic: "Maybe consider using a different approach?"
669
- - Brutal: "This approach is wrong because [reason]. Use [correct approach]."
670
- - **Result**: No confusion about expectations.
671
-
672
- **2. Scales Technical Standards**
673
- - Gentle mentoring works 1:1, doesn't scale
674
- - Brutal public technical breakdown teaches entire team
675
- - **Trade-off**: Works only with psychologically safe, mature teams.
676
-
677
- **3. Cuts Through BS**
678
- - Certifications, vendor hype, cargo cult practices thrive on politeness
679
- - Brutal honesty exposes when emperor has no clothes
680
- - **Result**: Resources spent on what actually helps.
681
-
682
- ### The Costs
683
-
684
- **1. Relationship Damage**
685
- - Harsh criticism without trust destroys collaboration
686
- - Public brutality creates hostile environment
687
- - **Mitigation**: Earn relationship capital first.
688
-
689
- **2. Chills Participation**
690
- - Fear of harsh feedback stops people from contributing
691
- - Newcomers avoid communities known for brutal feedback
692
- - **Mitigation**: Reserve brutality for experienced engineers and repeated mistakes.
231
+ ## Remember
693
232
 
694
- **3. Burnout**
695
- - Constant harsh criticism is exhausting
696
- - Both giver and receiver pay psychological cost
697
- - **Mitigation**: Use sparingly, only when necessary.
698
-
699
- ---
700
-
701
- ## The Brutal Honesty Contract
702
-
703
- Before using this skill, establish explicit contract:
704
-
705
- ```markdown
706
- "I'm going to give you unfiltered technical feedback. This will be direct,
707
- possibly harsh. The goal is clarity, not cruelty. I'll explain:
708
-
709
- 1. What's wrong (specifically)
710
- 2. Why it's wrong (technically)
711
- 3. What correct looks like
712
- 4. How to fix it
713
-
714
- If you want diplomatic feedback instead, let me know now."
715
- ```
716
-
717
- **Get explicit consent before proceeding.**
718
-
719
- ---
233
+ **Brutal honesty eliminates ambiguity but has costs.** Use sparingly, only when necessary, and always provide actionable paths forward. Attack the work, never the worker.
720
234
 
721
- **Created**: 2025-11-13
722
- **Category**: Quality Engineering / Code Review
723
- **Difficulty**: Advanced (requires judgment)
724
- **Use With Caution**: Can damage morale if misapplied
725
- **Best For**: Senior engineers, security issues, BS detection
235
+ **The Brutal Honesty Contract**: Get explicit consent. "I'm going to give unfiltered technical feedback. This will be direct, possibly harsh. The goal is clarity, not cruelty."
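
Editor's sketch (not part of the package diff): the calibration rule in this file can be read as a small guard that dials back to constructive feedback unless every precondition holds. The preconditions are the four checks from the removed Troubleshooting checklist plus the explicit consent required by the Brutal Honesty Contract. `ReviewContext`, `chooseCalibration`, and the `'constructive'` fallback value are hypothetical names introduced for illustration; only the calibration values `'direct'`, `'harsh'`, and `'brutal'` appear in the diff.

```typescript
// Calibration levels as they appear in the added Agent Integration example.
type Calibration = 'direct' | 'harsh' | 'brutal';

// Hypothetical context object: the four checks from the removed
// "Calibration is Wrong" checklist, plus the consent the contract asks for.
interface ReviewContext {
  recipientCanFix: boolean;         // recipient has the skills to fix it
  relationshipCapital: boolean;     // relationship capital exists
  contextAllowsDirectness: boolean; // e.g. not a public forum
  psychologicalSafety: boolean;     // psychological safety is high
  explicitConsent: boolean;         // recipient agreed to unfiltered feedback
}

// Returns the requested calibration only when every precondition holds;
// otherwise dials back to 'constructive' (an assumed fallback label).
function chooseCalibration(
  ctx: ReviewContext,
  requested: Calibration
): Calibration | 'constructive' {
  const allChecksPass =
    ctx.recipientCanFix &&
    ctx.relationshipCapital &&
    ctx.contextAllowsDirectness &&
    ctx.psychologicalSafety &&
    ctx.explicitConsent;
  return allChecksPass ? requested : 'constructive';
}

const seniorReview: ReviewContext = {
  recipientCanFix: true,
  relationshipCapital: true,
  contextAllowsDirectness: true,
  psychologicalSafety: true,
  explicitConsent: true,
};

// A junior developer's first PR fails the "can fix it" check.
const juniorFirstPr: ReviewContext = { ...seniorReview, recipientCanFix: false };

console.log(chooseCalibration(seniorReview, 'harsh'));   // harsh
console.log(chooseCalibration(juniorFirstPr, 'brutal')); // constructive
```

Treating consent as one more boolean keeps the guard to a single conjunction; a caller could pass the result as the `calibration` option only when it is not `'constructive'`, and route to the diplomatic code-review-quality skill otherwise.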