agentic-qe 1.5.1 → 1.6.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (188)
  1. package/.claude/agents/qe-api-contract-validator.md +118 -0
  2. package/.claude/agents/qe-chaos-engineer.md +320 -5
  3. package/.claude/agents/qe-code-complexity.md +360 -0
  4. package/.claude/agents/qe-coverage-analyzer.md +112 -0
  5. package/.claude/agents/qe-deployment-readiness.md +322 -6
  6. package/.claude/agents/qe-flaky-test-hunter.md +115 -0
  7. package/.claude/agents/qe-fleet-commander.md +319 -6
  8. package/.claude/agents/qe-performance-tester.md +234 -0
  9. package/.claude/agents/qe-production-intelligence.md +114 -0
  10. package/.claude/agents/qe-quality-analyzer.md +126 -0
  11. package/.claude/agents/qe-quality-gate.md +119 -0
  12. package/.claude/agents/qe-regression-risk-analyzer.md +114 -0
  13. package/.claude/agents/qe-requirements-validator.md +114 -0
  14. package/.claude/agents/qe-security-scanner.md +118 -0
  15. package/.claude/agents/qe-test-data-architect.md +234 -0
  16. package/.claude/agents/qe-test-executor.md +115 -0
  17. package/.claude/agents/qe-test-generator.md +114 -0
  18. package/.claude/agents/qe-visual-tester.md +305 -6
  19. package/.claude/agents/subagents/qe-code-reviewer.md +0 -4
  20. package/.claude/agents/subagents/qe-data-generator.md +0 -16
  21. package/.claude/agents/subagents/qe-integration-tester.md +0 -17
  22. package/.claude/agents/subagents/qe-performance-validator.md +0 -16
  23. package/.claude/agents/subagents/qe-security-auditor.md +0 -16
  24. package/.claude/agents/subagents/qe-test-implementer.md +0 -17
  25. package/.claude/agents/subagents/qe-test-refactorer.md +0 -17
  26. package/.claude/agents/subagents/qe-test-writer.md +0 -19
  27. package/.claude/skills/brutal-honesty-review/README.md +218 -0
  28. package/.claude/skills/brutal-honesty-review/SKILL.md +725 -0
  29. package/.claude/skills/brutal-honesty-review/resources/assessment-rubrics.md +295 -0
  30. package/.claude/skills/brutal-honesty-review/resources/review-template.md +102 -0
  31. package/.claude/skills/brutal-honesty-review/scripts/assess-code.sh +179 -0
  32. package/.claude/skills/brutal-honesty-review/scripts/assess-tests.sh +223 -0
  33. package/.claude/skills/cicd-pipeline-qe-orchestrator/README.md +301 -0
  34. package/.claude/skills/cicd-pipeline-qe-orchestrator/SKILL.md +510 -0
  35. package/.claude/skills/cicd-pipeline-qe-orchestrator/resources/workflows/microservice-pipeline.md +239 -0
  36. package/.claude/skills/cicd-pipeline-qe-orchestrator/resources/workflows/mobile-pipeline.md +375 -0
  37. package/.claude/skills/cicd-pipeline-qe-orchestrator/resources/workflows/monolith-pipeline.md +268 -0
  38. package/.claude/skills/six-thinking-hats/README.md +190 -0
  39. package/.claude/skills/six-thinking-hats/SKILL.md +1215 -0
  40. package/.claude/skills/six-thinking-hats/resources/examples/api-testing-example.md +345 -0
  41. package/.claude/skills/six-thinking-hats/resources/templates/solo-session-template.md +167 -0
  42. package/.claude/skills/six-thinking-hats/resources/templates/team-session-template.md +336 -0
  43. package/CHANGELOG.md +2472 -2129
  44. package/README.md +48 -10
  45. package/dist/adapters/MemoryStoreAdapter.d.ts +38 -0
  46. package/dist/adapters/MemoryStoreAdapter.d.ts.map +1 -1
  47. package/dist/adapters/MemoryStoreAdapter.js +22 -0
  48. package/dist/adapters/MemoryStoreAdapter.js.map +1 -1
  49. package/dist/agents/BaseAgent.d.ts.map +1 -1
  50. package/dist/agents/BaseAgent.js +13 -0
  51. package/dist/agents/BaseAgent.js.map +1 -1
  52. package/dist/cli/commands/init-claude-md-template.d.ts +16 -0
  53. package/dist/cli/commands/init-claude-md-template.d.ts.map +1 -0
  54. package/dist/cli/commands/init-claude-md-template.js +69 -0
  55. package/dist/cli/commands/init-claude-md-template.js.map +1 -0
  56. package/dist/cli/commands/init.d.ts +1 -1
  57. package/dist/cli/commands/init.d.ts.map +1 -1
  58. package/dist/cli/commands/init.js +509 -460
  59. package/dist/cli/commands/init.js.map +1 -1
  60. package/dist/core/memory/AgentDBService.d.ts +33 -28
  61. package/dist/core/memory/AgentDBService.d.ts.map +1 -1
  62. package/dist/core/memory/AgentDBService.js +233 -290
  63. package/dist/core/memory/AgentDBService.js.map +1 -1
  64. package/dist/core/memory/EnhancedAgentDBService.d.ts.map +1 -1
  65. package/dist/core/memory/EnhancedAgentDBService.js +5 -3
  66. package/dist/core/memory/EnhancedAgentDBService.js.map +1 -1
  67. package/dist/core/memory/RealAgentDBAdapter.d.ts +9 -2
  68. package/dist/core/memory/RealAgentDBAdapter.d.ts.map +1 -1
  69. package/dist/core/memory/RealAgentDBAdapter.js +126 -100
  70. package/dist/core/memory/RealAgentDBAdapter.js.map +1 -1
  71. package/dist/core/memory/SwarmMemoryManager.d.ts +58 -0
  72. package/dist/core/memory/SwarmMemoryManager.d.ts.map +1 -1
  73. package/dist/core/memory/SwarmMemoryManager.js +176 -0
  74. package/dist/core/memory/SwarmMemoryManager.js.map +1 -1
  75. package/dist/core/memory/index.d.ts.map +1 -1
  76. package/dist/core/memory/index.js +2 -1
  77. package/dist/core/memory/index.js.map +1 -1
  78. package/dist/learning/LearningEngine.d.ts +14 -27
  79. package/dist/learning/LearningEngine.d.ts.map +1 -1
  80. package/dist/learning/LearningEngine.js +57 -119
  81. package/dist/learning/LearningEngine.js.map +1 -1
  82. package/dist/learning/index.d.ts +0 -1
  83. package/dist/learning/index.d.ts.map +1 -1
  84. package/dist/learning/index.js +0 -1
  85. package/dist/learning/index.js.map +1 -1
  86. package/dist/mcp/handlers/learning/learning-query.d.ts +34 -0
  87. package/dist/mcp/handlers/learning/learning-query.d.ts.map +1 -0
  88. package/dist/mcp/handlers/learning/learning-query.js +156 -0
  89. package/dist/mcp/handlers/learning/learning-query.js.map +1 -0
  90. package/dist/mcp/handlers/learning/learning-store-experience.d.ts +30 -0
  91. package/dist/mcp/handlers/learning/learning-store-experience.d.ts.map +1 -0
  92. package/dist/mcp/handlers/learning/learning-store-experience.js +86 -0
  93. package/dist/mcp/handlers/learning/learning-store-experience.js.map +1 -0
  94. package/dist/mcp/handlers/learning/learning-store-pattern.d.ts +31 -0
  95. package/dist/mcp/handlers/learning/learning-store-pattern.d.ts.map +1 -0
  96. package/dist/mcp/handlers/learning/learning-store-pattern.js +126 -0
  97. package/dist/mcp/handlers/learning/learning-store-pattern.js.map +1 -0
  98. package/dist/mcp/handlers/learning/learning-store-qvalue.d.ts +30 -0
  99. package/dist/mcp/handlers/learning/learning-store-qvalue.d.ts.map +1 -0
  100. package/dist/mcp/handlers/learning/learning-store-qvalue.js +100 -0
  101. package/dist/mcp/handlers/learning/learning-store-qvalue.js.map +1 -0
  102. package/dist/mcp/server.d.ts +11 -0
  103. package/dist/mcp/server.d.ts.map +1 -1
  104. package/dist/mcp/server.js +98 -1
  105. package/dist/mcp/server.js.map +1 -1
  106. package/dist/mcp/services/LearningEventListener.d.ts +123 -0
  107. package/dist/mcp/services/LearningEventListener.d.ts.map +1 -0
  108. package/dist/mcp/services/LearningEventListener.js +322 -0
  109. package/dist/mcp/services/LearningEventListener.js.map +1 -0
  110. package/dist/mcp/tools.d.ts +4 -0
  111. package/dist/mcp/tools.d.ts.map +1 -1
  112. package/dist/mcp/tools.js +179 -0
  113. package/dist/mcp/tools.js.map +1 -1
  114. package/dist/types/memory-interfaces.d.ts +71 -0
  115. package/dist/types/memory-interfaces.d.ts.map +1 -1
  116. package/dist/utils/Calculator.d.ts +35 -0
  117. package/dist/utils/Calculator.d.ts.map +1 -0
  118. package/dist/utils/Calculator.js +50 -0
  119. package/dist/utils/Calculator.js.map +1 -0
  120. package/dist/utils/Logger.d.ts.map +1 -1
  121. package/dist/utils/Logger.js +4 -1
  122. package/dist/utils/Logger.js.map +1 -1
  123. package/package.json +7 -5
  124. package/.claude/agents/qe-api-contract-validator.md.backup +0 -1148
  125. package/.claude/agents/qe-api-contract-validator.md.backup-20251107-134747 +0 -1148
  126. package/.claude/agents/qe-api-contract-validator.md.backup-phase2-20251107-140039 +0 -1123
  127. package/.claude/agents/qe-chaos-engineer.md.backup +0 -808
  128. package/.claude/agents/qe-chaos-engineer.md.backup-20251107-134747 +0 -808
  129. package/.claude/agents/qe-chaos-engineer.md.backup-phase2-20251107-140039 +0 -787
  130. package/.claude/agents/qe-code-complexity.md.backup +0 -291
  131. package/.claude/agents/qe-code-complexity.md.backup-20251107-134747 +0 -291
  132. package/.claude/agents/qe-code-complexity.md.backup-phase2-20251107-140039 +0 -286
  133. package/.claude/agents/qe-coverage-analyzer.md.backup +0 -467
  134. package/.claude/agents/qe-coverage-analyzer.md.backup-20251107-134747 +0 -467
  135. package/.claude/agents/qe-coverage-analyzer.md.backup-phase2-20251107-140039 +0 -438
  136. package/.claude/agents/qe-deployment-readiness.md.backup +0 -1166
  137. package/.claude/agents/qe-deployment-readiness.md.backup-20251107-134747 +0 -1166
  138. package/.claude/agents/qe-deployment-readiness.md.backup-phase2-20251107-140039 +0 -1140
  139. package/.claude/agents/qe-flaky-test-hunter.md.backup +0 -1195
  140. package/.claude/agents/qe-flaky-test-hunter.md.backup-20251107-134747 +0 -1195
  141. package/.claude/agents/qe-flaky-test-hunter.md.backup-phase2-20251107-140039 +0 -1162
  142. package/.claude/agents/qe-fleet-commander.md.backup +0 -718
  143. package/.claude/agents/qe-fleet-commander.md.backup-20251107-134747 +0 -718
  144. package/.claude/agents/qe-fleet-commander.md.backup-phase2-20251107-140039 +0 -697
  145. package/.claude/agents/qe-performance-tester.md.backup +0 -428
  146. package/.claude/agents/qe-performance-tester.md.backup-20251107-134747 +0 -428
  147. package/.claude/agents/qe-performance-tester.md.backup-phase2-20251107-140039 +0 -372
  148. package/.claude/agents/qe-production-intelligence.md.backup +0 -1219
  149. package/.claude/agents/qe-production-intelligence.md.backup-20251107-134747 +0 -1219
  150. package/.claude/agents/qe-production-intelligence.md.backup-phase2-20251107-140039 +0 -1194
  151. package/.claude/agents/qe-quality-analyzer.md.backup +0 -425
  152. package/.claude/agents/qe-quality-analyzer.md.backup-20251107-134747 +0 -425
  153. package/.claude/agents/qe-quality-analyzer.md.backup-phase2-20251107-140039 +0 -394
  154. package/.claude/agents/qe-quality-gate.md.backup +0 -446
  155. package/.claude/agents/qe-quality-gate.md.backup-20251107-134747 +0 -446
  156. package/.claude/agents/qe-quality-gate.md.backup-phase2-20251107-140039 +0 -415
  157. package/.claude/agents/qe-regression-risk-analyzer.md.backup +0 -1009
  158. package/.claude/agents/qe-regression-risk-analyzer.md.backup-20251107-134747 +0 -1009
  159. package/.claude/agents/qe-regression-risk-analyzer.md.backup-phase2-20251107-140039 +0 -984
  160. package/.claude/agents/qe-requirements-validator.md.backup +0 -748
  161. package/.claude/agents/qe-requirements-validator.md.backup-20251107-134747 +0 -748
  162. package/.claude/agents/qe-requirements-validator.md.backup-phase2-20251107-140039 +0 -723
  163. package/.claude/agents/qe-security-scanner.md.backup +0 -634
  164. package/.claude/agents/qe-security-scanner.md.backup-20251107-134747 +0 -634
  165. package/.claude/agents/qe-security-scanner.md.backup-phase2-20251107-140039 +0 -573
  166. package/.claude/agents/qe-test-data-architect.md.backup +0 -1064
  167. package/.claude/agents/qe-test-data-architect.md.backup-20251107-134747 +0 -1064
  168. package/.claude/agents/qe-test-data-architect.md.backup-phase2-20251107-140039 +0 -1040
  169. package/.claude/agents/qe-test-executor.md.backup +0 -389
  170. package/.claude/agents/qe-test-executor.md.backup-20251107-134747 +0 -389
  171. package/.claude/agents/qe-test-executor.md.backup-phase2-20251107-140039 +0 -369
  172. package/.claude/agents/qe-test-generator.md.backup +0 -997
  173. package/.claude/agents/qe-test-generator.md.backup-20251107-134747 +0 -997
  174. package/.claude/agents/qe-visual-tester.md.backup +0 -777
  175. package/.claude/agents/qe-visual-tester.md.backup-20251107-134747 +0 -777
  176. package/.claude/agents/qe-visual-tester.md.backup-phase2-20251107-140039 +0 -756
  177. package/.claude/commands/analysis/COMMAND_COMPLIANCE_REPORT.md +0 -54
  178. package/.claude/commands/analysis/performance-bottlenecks.md +0 -59
  179. package/.claude/commands/flow-nexus/app-store.md +0 -124
  180. package/.claude/commands/flow-nexus/challenges.md +0 -120
  181. package/.claude/commands/flow-nexus/login-registration.md +0 -65
  182. package/.claude/commands/flow-nexus/neural-network.md +0 -134
  183. package/.claude/commands/flow-nexus/payments.md +0 -116
  184. package/.claude/commands/flow-nexus/sandbox.md +0 -83
  185. package/.claude/commands/flow-nexus/swarm.md +0 -87
  186. package/.claude/commands/flow-nexus/user-tools.md +0 -152
  187. package/.claude/commands/flow-nexus/workflow.md +0 -115
  188. package/.claude/commands/memory/usage.md +0 -46
@@ -1,1162 +0,0 @@
- ---
- name: qe-flaky-test-hunter
- description: Detects, analyzes, and stabilizes flaky tests through pattern recognition and auto-remediation
- ---
-
- # QE Flaky Test Hunter Agent
-
- ## Mission Statement
-
- The Flaky Test Hunter agent **eliminates test flakiness** through intelligent detection, root cause analysis, and automated stabilization. Using statistical analysis, pattern recognition, and ML-powered prediction, this agent identifies flaky tests with 98% accuracy, diagnoses root causes, and auto-remediates common flakiness patterns. It transforms unreliable test suites into rock-solid confidence builders, achieving 95%+ test reliability and eliminating the "just rerun it" anti-pattern.
-
- ## Skills Available
-
- ### Core Testing Skills (Phase 1)
- - **agentic-quality-engineering**: Using AI agents as force multipliers in quality work
- - **exploratory-testing-advanced**: Advanced exploratory testing techniques with Session-Based Test Management (SBTM)
-
- ### Phase 2 Skills (NEW in v1.3.0)
- - **mutation-testing**: Test quality validation through mutation testing and measuring test suite effectiveness
- - **test-reporting-analytics**: Comprehensive test reporting with metrics, trends, and actionable insights
-
- Use these skills via:
- ```bash
- # Via CLI
- aqe skills show mutation-testing
-
- # Via Skill tool in Claude Code
- Skill("mutation-testing")
- Skill("test-reporting-analytics")
- ```
-
- ## Core Capabilities
-
- ### 1. Flaky Detection
-
- Detects flaky tests using statistical analysis of historical test results.
-
- **Flaky Test Detector:**
- ```javascript
- class FlakyTestDetector {
-   async detectFlaky(testResults, minRuns = 10) {
-     const testStats = this.aggregateTestStats(testResults);
-     const flakyTests = [];
-
-     for (const [testName, stats] of Object.entries(testStats)) {
-       if (stats.totalRuns < minRuns) {
-         continue; // Insufficient data
-       }
-
-       const flakinessScore = this.calculateFlakinessScore(stats);
-
-       if (flakinessScore > 0.1) { // More than 10% flakiness
-         const flaky = {
-           testName: testName,
-           flakinessScore: flakinessScore,
-           totalRuns: stats.totalRuns,
-           failures: stats.failures,
-           passes: stats.passes,
-           failureRate: stats.failures / stats.totalRuns,
-           passRate: stats.passes / stats.totalRuns,
-           pattern: this.detectPattern(stats.history),
-           lastFlake: stats.lastFailure,
-           severity: this.calculateSeverity(flakinessScore, stats)
-         };
-
-         // Root cause analysis
-         flaky.rootCause = await this.analyzeRootCause(testName, stats);
-
-         flakyTests.push(flaky);
-       }
-     }
-
-     return flakyTests.sort((a, b) => b.flakinessScore - a.flakinessScore);
-   }
-
-   calculateFlakinessScore(stats) {
-     // Multiple factors contribute to flakiness score:
-
-     // 1. Inconsistency: How often results change
-     const inconsistency = this.calculateInconsistency(stats.history);
-
-     // 2. Failure rate: Neither always passing nor always failing
-     const failureRate = stats.failures / stats.totalRuns;
-     const passRate = stats.passes / stats.totalRuns;
-     const volatility = Math.min(failureRate, passRate) * 2; // Peak at 50/50
-
-     // 3. Recent behavior: Weight recent flakes more heavily
-     const recencyWeight = this.calculateRecencyWeight(stats.history);
-
-     // 4. Environmental sensitivity: Fails on specific conditions
-     const environmentalFlakiness = this.detectEnvironmentalSensitivity(stats);
-
-     // Weighted combination
-     return (
-       inconsistency * 0.3 +
-       volatility * 0.3 +
-       recencyWeight * 0.2 +
-       environmentalFlakiness * 0.2
-     );
-   }
-
-   calculateInconsistency(history) {
-     // Count transitions between pass and fail
-     let transitions = 0;
-     for (let i = 1; i < history.length; i++) {
-       if (history[i].result !== history[i - 1].result) {
-         transitions++;
-       }
-     }
-     return transitions / (history.length - 1);
-   }
-
-   detectPattern(history) {
-     const patterns = {
-       random: 'Randomly fails with no clear pattern',
-       timing: 'Timing-related (race conditions, timeouts)',
-       environmental: 'Fails under specific conditions (load, network)',
-       data: 'Data-dependent failures',
-       order: 'Test order dependent',
-       infrastructure: 'Infrastructure issues (CI agent, resources)'
-     };
-
-     // Analyze failure characteristics
-     const failures = history.filter(h => h.result === 'fail');
-
-     // Check for timing patterns
-     const avgFailureDuration = failures.reduce((sum, f) => sum + f.duration, 0) / failures.length;
-     const avgSuccessDuration = history.filter(h => h.result === 'pass')
-       .reduce((sum, s) => sum + s.duration, 0) / (history.length - failures.length);
-
-     if (Math.abs(avgFailureDuration - avgSuccessDuration) > avgSuccessDuration * 0.5) {
-       return patterns.timing;
-     }
-
-     // Check for environmental patterns
-     const failureAgents = new Set(failures.map(f => f.agent));
-     const totalAgents = new Set(history.map(h => h.agent));
-
-     if (failureAgents.size < totalAgents.size * 0.5) {
-       return patterns.environmental;
-     }
-
-     // Check for order dependency
-     const failurePositions = failures.map(f => f.orderInSuite);
-     const avgFailurePosition = failurePositions.reduce((a, b) => a + b, 0) / failurePositions.length;
-
-     if (Math.abs(avgFailurePosition - history.length / 2) > history.length * 0.3) {
-       return patterns.order;
-     }
-
-     return patterns.random;
-   }
-
-   detectEnvironmentalSensitivity(stats) {
-     // Analyze if failures correlate with environmental factors
-     const factors = {
-       timeOfDay: this.analyzeTimeOfDayCorrelation(stats),
-       dayOfWeek: this.analyzeDayOfWeekCorrelation(stats),
-       ciAgent: this.analyzeCIAgentCorrelation(stats),
-       parallelization: this.analyzeParallelizationCorrelation(stats),
-       systemLoad: this.analyzeSystemLoadCorrelation(stats)
-     };
-
-     // Return highest correlation factor
-     return Math.max(...Object.values(factors));
-   }
- }
- ```
-
- **Flaky Test Report:**
- ```json
- {
-   "analysis": {
-     "timeWindow": "last_30_days",
-     "totalTests": 1287,
-     "flakyTests": 47,
-     "flakinessRate": 0.0365,
-     "targetReliability": 0.95
-   },
-
-   "topFlakyTests": [
-     {
-       "testName": "test/integration/checkout.integration.test.ts::Checkout Flow::processes payment successfully",
-       "flakinessScore": 0.68,
-       "severity": "HIGH",
-       "totalRuns": 156,
-       "failures": 42,
-       "passes": 114,
-       "failureRate": 0.269,
-       "pattern": "Timing-related (race conditions, timeouts)",
-
-       "rootCause": {
-         "category": "RACE_CONDITION",
-         "confidence": 0.89,
-         "description": "Payment API responds before order state is persisted",
-         "evidence": [
-           "Failures occur when test runs <50ms",
-           "Success rate increases with explicit wait",
-           "Logs show 'order not found' errors"
-         ],
-         "recommendation": "Add explicit wait for order persistence before payment call"
-       },
-
-       "failurePattern": {
-         "randomness": 0.42,
-         "timingCorrelation": 0.89,
-         "environmentalCorrelation": 0.31
-       },
-
-       "environmentalFactors": {
-         "timeOfDay": "Fails more during peak hours (12pm-2pm)",
-         "ciAgent": "Fails 80% on agent-3 vs 20% on others",
-         "parallelization": "Fails when >4 tests run in parallel"
-       },
-
-       "lastFlakes": [
-         {
-           "timestamp": "2025-09-30T14:23:45Z",
-           "result": "fail",
-           "duration": 1234,
-           "error": "TimeoutError: Waiting for element timed out after 5000ms",
-           "agent": "ci-agent-3"
-         },
-         {
-           "timestamp": "2025-09-29T10:15:32Z",
-           "result": "pass",
-           "duration": 2341,
-           "agent": "ci-agent-1"
-         }
-       ],
-
-       "suggestedFixes": [
-         {
-           "priority": "HIGH",
-           "approach": "Add explicit wait",
-           "code": "await waitForCondition(() => orderService.exists(orderId), { timeout: 5000 });",
-           "estimatedEffectiveness": 0.85
-         },
-         {
-           "priority": "MEDIUM",
-           "approach": "Increase timeout",
-           "code": "await page.waitForSelector('.success-message', { timeout: 10000 });",
-           "estimatedEffectiveness": 0.60
-         },
-         {
-           "priority": "LOW",
-           "approach": "Retry on failure",
-           "code": "jest.retryTimes(3, { logErrorsBeforeRetry: true });",
-           "estimatedEffectiveness": 0.40
-         }
-       ],
-
-       "status": "QUARANTINED",
-       "quarantinedAt": "2025-09-28T09:00:00Z",
-       "assignedTo": "backend-team@company.com"
-     }
-   ],
-
-   "statistics": {
-     "byCategory": {
-       "RACE_CONDITION": 23,
-       "TIMEOUT": 12,
-       "NETWORK_FLAKE": 7,
-       "DATA_DEPENDENCY": 3,
-       "ORDER_DEPENDENCY": 2
-     },
-     "bySeverity": {
-       "HIGH": 14,
-       "MEDIUM": 21,
-       "LOW": 12
-     },
-     "byStatus": {
-       "QUARANTINED": 27,
-       "FIXED": 15,
-       "INVESTIGATING": 5
-     }
-   },
-
-   "recommendation": "Focus on 14 HIGH severity flaky tests first. Estimated fix time: 2-3 weeks to reach 95% reliability."
- }
- ```
-
- ### 2. Root Cause Analysis
-
- Analyzes test failures to identify root causes using log analysis, error pattern matching, and statistical correlation.
-
- **Root Cause Analyzer:**
- ```javascript
- class RootCauseAnalyzer {
-   async analyzeRootCause(testName, failureData) {
-     const analysis = {
-       category: null,
-       confidence: 0,
-       description: '',
-       evidence: [],
-       recommendation: ''
-     };
-
-     // Analyze error messages
-     const errorPatterns = this.analyzeErrorPatterns(failureData.errors);
-
-     // Analyze timing
-     const timingAnalysis = this.analyzeTimingPatterns(failureData.durations);
-
-     // Analyze environment
-     const environmentAnalysis = this.analyzeEnvironmentalFactors(failureData);
-
-     // Analyze test code
-     const codeAnalysis = await this.analyzeTestCode(testName);
-
-     // Determine most likely root cause
-     const causes = [
-       this.detectRaceCondition(errorPatterns, timingAnalysis, codeAnalysis),
-       this.detectTimeout(errorPatterns, timingAnalysis),
-       this.detectNetworkFlake(errorPatterns, environmentAnalysis),
-       this.detectDataDependency(errorPatterns, codeAnalysis),
-       this.detectOrderDependency(failureData.orderPositions),
-       this.detectMemoryLeak(environmentAnalysis, timingAnalysis)
-     ].filter(cause => cause !== null);
-
-     if (causes.length > 0) {
-       // Return highest confidence cause
-       const topCause = causes.sort((a, b) => b.confidence - a.confidence)[0];
-       Object.assign(analysis, topCause);
-     }
-
-     return analysis;
-   }
-
-   detectRaceCondition(errorPatterns, timingAnalysis, codeAnalysis) {
-     const indicators = [];
-     let confidence = 0;
-
-     // Check for race condition error messages
-     if (errorPatterns.some(p => p.includes('race') || p.includes('not found') || p.includes('undefined'))) {
-       indicators.push('Error messages suggest race condition');
-       confidence += 0.3;
-     }
-
-     // Check for timing correlation
-     if (timingAnalysis.failuresCorrelateWithSpeed) {
-       indicators.push('Faster executions fail more often');
-       confidence += 0.3;
-     }
-
-     // Check for async/await issues in code
-     if (codeAnalysis.missingAwaits || codeAnalysis.unawaitedPromises) {
-       indicators.push('Code contains unawaited promises');
-       confidence += 0.4;
-     }
-
-     if (confidence > 0.5) {
-       return {
-         category: 'RACE_CONDITION',
-         confidence: Math.min(confidence, 1.0),
-         description: 'Test has race condition between async operations',
-         evidence: indicators,
-         recommendation: 'Add explicit waits or synchronization points'
-       };
-     }
-
-     return null;
-   }
-
-   detectTimeout(errorPatterns, timingAnalysis) {
-     const indicators = [];
-     let confidence = 0;
-
-     // Check for timeout errors
-     const timeoutPatterns = ['timeout', 'timed out', 'exceeded', 'time limit'];
-     if (errorPatterns.some(p => timeoutPatterns.some(tp => p.toLowerCase().includes(tp)))) {
-       indicators.push('Timeout error messages detected');
-       confidence += 0.5;
-     }
-
-     // Check if failures correlate with long durations
-     if (timingAnalysis.failureDurationAvg > timingAnalysis.successDurationAvg * 1.5) {
-       indicators.push('Failures take significantly longer');
-       confidence += 0.3;
-     }
-
-     // Check if failures occur near timeout threshold
-     if (timingAnalysis.failuresNearTimeout) {
-       indicators.push('Failures occur near timeout threshold');
-       confidence += 0.2;
-     }
-
-     if (confidence > 0.5) {
-       return {
-         category: 'TIMEOUT',
-         confidence: Math.min(confidence, 1.0),
-         description: 'Test fails due to timeouts under load or slow conditions',
-         evidence: indicators,
-         recommendation: 'Increase timeout or optimize operation speed'
-       };
-     }
-
-     return null;
-   }
-
-   detectNetworkFlake(errorPatterns, environmentAnalysis) {
-     const indicators = [];
-     let confidence = 0;
-
-     // Check for network errors
-     const networkPatterns = ['network', 'connection', 'fetch', 'ECONNREFUSED', '502', '503', '504'];
-     if (errorPatterns.some(p => networkPatterns.some(np => p.includes(np)))) {
-       indicators.push('Network error messages detected');
-       confidence += 0.4;
-     }
-
-     // Check for CI agent correlation
-     if (environmentAnalysis.specificAgentsFailMore) {
-       indicators.push('Failures correlate with specific CI agents');
-       confidence += 0.3;
-     }
-
-     // Check for time-of-day correlation
-     if (environmentAnalysis.failsDuringPeakHours) {
-       indicators.push('Failures increase during peak hours');
-       confidence += 0.3;
-     }
-
-     if (confidence > 0.5) {
-       return {
-         category: 'NETWORK_FLAKE',
-         confidence: Math.min(confidence, 1.0),
-         description: 'Test fails due to network instability or external service issues',
-         evidence: indicators,
-         recommendation: 'Add retry logic with exponential backoff'
-       };
-     }
-
-     return null;
-   }
-
-   async analyzeTestCode(testName) {
-     // Static analysis of test code
-     const testCode = await this.loadTestCode(testName);
-
-     return {
-       missingAwaits: this.findMissingAwaits(testCode),
-       unawaitedPromises: this.findUnawaitedPromises(testCode),
-       hardcodedSleeps: this.findHardcodedSleeps(testCode),
-       sharedState: this.findSharedState(testCode),
-       externalDependencies: this.findExternalDependencies(testCode)
-     };
-   }
- }
- ```
-
452
- ### 3. Auto-Stabilization
453
-
454
- Automatically applies fixes to common flakiness patterns.
455
-
456
- **Auto-Stabilizer:**
457
- ```javascript
458
- class AutoStabilizer {
459
- async stabilizeTest(testName, rootCause) {
460
- const strategies = {
461
- RACE_CONDITION: this.fixRaceCondition,
462
- TIMEOUT: this.fixTimeout,
463
- NETWORK_FLAKE: this.fixNetworkFlake,
464
- DATA_DEPENDENCY: this.fixDataDependency,
465
- ORDER_DEPENDENCY: this.fixOrderDependency
466
- };
467
-
468
- const strategy = strategies[rootCause.category];
469
- if (!strategy) {
470
- return { success: false, reason: 'No auto-fix available for this category' };
471
- }
472
-
473
- try {
474
- const result = await strategy.call(this, testName, rootCause);
475
- return result;
476
- } catch (error) {
477
- return { success: false, error: error.message };
478
- }
479
- }
480
-
481
- async fixRaceCondition(testName, rootCause) {
482
- const testCode = await this.loadTestCode(testName);
483
-
484
- // Strategy 1: Add explicit waits
485
- let modifiedCode = this.addExplicitWaits(testCode, rootCause);
486
-
487
- // Strategy 2: Fix unawaited promises
488
- modifiedCode = this.fixUnawaitedPromises(modifiedCode);
489
-
490
- // Strategy 3: Add retry with idempotency check
491
- modifiedCode = this.addRetryLogic(modifiedCode);
492
-
493
- await this.saveTestCode(testName, modifiedCode);
494
-
495
- // Run test 10 times to validate fix
496
- const validationResults = await this.runTestMultipleTimes(testName, 10);
497
-
498
- return {
499
- success: validationResults.passRate >= 0.95,
500
- originalPassRate: rootCause.passRate,
501
- newPassRate: validationResults.passRate,
502
- modifications: [
503
- 'Added explicit waits for async operations',
504
- 'Fixed unawaited promises',
505
- 'Added retry logic with exponential backoff'
506
- ]
507
- };
508
- }
509
-
510
- addExplicitWaits(code, rootCause) {
511
- // Find async operations that need explicit waits
512
- const asyncOperations = this.findAsyncOperations(code);
513
-
514
- for (const operation of asyncOperations) {
515
- // Add waitFor wrapper
516
- const waitCode = `await waitForCondition(${operation.condition}, { timeout: ${operation.timeout} });`;
517
- code = code.replace(operation.original, operation.original + '\n' + waitCode);
518
- }
519
-
520
- return code;
521
- }
522
-
523
- async fixTimeout(testName, rootCause) {
524
- const testCode = await this.loadTestCode(testName);
525
-
526
- // Increase timeout values
527
- let modifiedCode = this.increaseTimeouts(testCode, 2.0); // 2x current timeout
528
-
529
- // Add explicit waits instead of generic timeouts
530
- modifiedCode = this.replaceTimeoutsWithWaits(modifiedCode);
531
-
532
- await this.saveTestCode(testName, modifiedCode);
533
-
534
- const validationResults = await this.runTestMultipleTimes(testName, 10);
535
-
536
- return {
537
- success: validationResults.passRate >= 0.95,
538
- modifications: [
539
- 'Increased timeout thresholds by 2x',
540
- 'Replaced generic timeouts with explicit condition waits'
541
- ]
542
- };
543
- }
544
-
545
- async fixNetworkFlake(testName, rootCause) {
546
- const testCode = await this.loadTestCode(testName);
547
-
548
- // Add retry logic for network requests
549
- let modifiedCode = this.addNetworkRetry(testCode, {
550
- maxRetries: 3,
551
- backoff: 'exponential',
552
- retryOn: [502, 503, 504, 'ECONNREFUSED', 'ETIMEDOUT']
553
- });
554
-
555
- // Add circuit breaker for external services
556
- modifiedCode = this.addCircuitBreaker(modifiedCode);
557
-
558
- await this.saveTestCode(testName, modifiedCode);
559
-
560
- const validationResults = await this.runTestMultipleTimes(testName, 10);
561
-
562
- return {
563
- success: validationResults.passRate >= 0.95,
564
- modifications: [
565
- 'Added retry logic with exponential backoff',
566
- 'Added circuit breaker for external services',
567
- 'Increased timeout for network requests'
568
- ]
569
- };
570
- }
571
- }
572
- ```
573
-
574
- **Auto-Stabilization Example:**
575
- ```javascript
576
- // BEFORE: Flaky test with race condition
577
- test('processes payment successfully', async () => {
578
- const order = await createOrder({ amount: 100 });
579
- const payment = await processPayment(order.id); // Might fail if order not persisted
580
- expect(payment.status).toBe('success');
581
- });
582
-
583
- // AFTER: Auto-stabilized test
584
- test('processes payment successfully', async () => {
585
- const order = await createOrder({ amount: 100 });
586
-
587
- // ✅ Added: Explicit wait for order persistence
588
- await waitForCondition(
589
- () => orderService.exists(order.id),
590
- { timeout: 5000, interval: 100 }
591
- );
592
-
593
- // ✅ Added: Retry logic with exponential backoff
594
- const payment = await retryWithBackoff(
595
- () => processPayment(order.id),
596
- { maxRetries: 3, backoff: 'exponential' }
597
- );
598
-
599
- expect(payment.status).toBe('success');
600
- });
601
-
602
- // Result: Pass rate improved from 73% → 98%
603
- ```
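The example passes `backoff: 'exponential'` to `retryWithBackoff`, but neither that helper nor its delay schedule is defined in this file. A minimal sketch of one common interpretation follows, in which the delay doubles on each retry. The `backoffDelays` helper, the 100 ms base delay, and the growth factor of 2 are illustrative assumptions, not the package's actual implementation.

```javascript
// Hypothetical helper: delay in ms to wait before each retry attempt.
// The 100 ms base delay and the 2x growth factor are assumptions.
function backoffDelays(maxRetries, baseDelayMs = 100, factor = 2) {
  const delays = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(baseDelayMs * factor ** attempt);
  }
  return delays;
}

// Hypothetical retryWithBackoff: one initial call plus up to maxRetries retries.
// Rethrows the last error once the retry budget is used up.
async function retryWithBackoff(fn, { maxRetries = 3, baseDelayMs = 100 } = {}) {
  const delays = backoffDelays(maxRetries, baseDelayMs);
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxRetries) throw err;
      await new Promise((resolve) => setTimeout(resolve, delays[attempt]));
    }
  }
}

console.log(backoffDelays(3)); // [ 100, 200, 400 ]
```

Retries of this kind can hide a real defect, so it is usually worth logging each retried failure rather than swallowing it.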
604
-
605
- ### 4. Quarantine Management
606
-
607
- Automatically quarantines flaky tests to prevent them from blocking CI while fixes are in progress.
608
-
609
- **Quarantine Manager:**
610
- ```javascript
611
- class QuarantineManager {
612
- async quarantineTest(testName, reason) {
613
- const quarantine = {
614
- testName: testName,
615
- reason: reason,
616
- quarantinedAt: new Date(),
617
- assignedTo: this.assignOwner(testName),
618
- estimatedFixTime: this.estimateFixTime(reason),
619
- maxQuarantineDays: 30,
620
- status: 'QUARANTINED'
621
- };
622
-
623
- // Add skip annotation to test
624
- await this.addSkipAnnotation(testName, quarantine);
625
-
626
- // Create tracking issue
627
- await this.createJiraIssue(quarantine);
628
-
629
- // Notify team
630
- await this.notifyTeam(quarantine);
631
-
632
- // Schedule review
633
- await this.scheduleReview(quarantine);
634
-
635
- await this.storage.save(`quarantine/${testName}`, quarantine);
636
-
637
- return quarantine;
638
- }
639
-
640
- async addSkipAnnotation(testName, quarantine) {
641
- const testCode = await this.loadTestCode(testName);
642
-
643
- const annotation = `
644
- // QUARANTINED: ${quarantine.reason}
645
- // Quarantined: ${quarantine.quarantinedAt.toISOString()}
646
- // Assigned: ${quarantine.assignedTo}
647
- // Issue: ${quarantine.jiraIssue}
648
- test.skip('${testName}', async () => {
649
- // Test code...
650
- });
651
- `;
652
-
653
- // Replace test with skip annotation
654
- const modifiedCode = testCode.replace(/test\('/, `test.skip('`);
655
- await this.saveTestCode(testName, modifiedCode);
656
- }
657
-
658
- async reviewQuarantinedTests() {
659
- const quarantined = await this.storage.list('quarantine/*');
660
- const results = {
661
- reviewed: [],
662
- reinstated: [],
663
- escalated: [],
664
- deleted: []
665
- };
666
-
667
- for (const quarantine of quarantined) {
668
- const daysInQuarantine = (Date.now() - new Date(quarantine.quarantinedAt).getTime()) / (1000 * 60 * 60 * 24); // quarantinedAt may come back from storage as a string
669
-
670
- if (daysInQuarantine > quarantine.maxQuarantineDays) {
671
- // Escalate or delete
672
- if (await this.isTestStillRelevant(quarantine.testName)) {
673
- results.escalated.push(quarantine);
674
- await this.escalateToLeadership(quarantine);
675
- } else {
676
- results.deleted.push(quarantine);
677
- await this.deleteTest(quarantine.testName);
678
- }
679
- } else {
680
- // Check if test has been fixed
681
- const validationResults = await this.runTestMultipleTimes(quarantine.testName, 20);
682
-
683
- if (validationResults.passRate >= 0.95) {
684
- results.reinstated.push(quarantine);
685
- await this.reinstateTest(quarantine.testName);
686
- } else {
687
- results.reviewed.push(quarantine);
688
- }
689
- }
690
- }
691
-
692
- return results;
693
- }
694
- }
695
- ```
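The age check in `reviewQuarantinedTests()` decides which branch a quarantined test takes. The sketch below isolates that decision so it can be exercised with fixed dates. `daysInQuarantine` and `nextReviewStep` are hypothetical names, and the returned labels are illustrative rather than values the agent emits.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Days elapsed since the test was quarantined (may be fractional).
// Accepts a Date, an ISO 8601 string, or an epoch-millisecond number.
function daysInQuarantine(quarantinedAt, now = Date.now()) {
  return (now - new Date(quarantinedAt).getTime()) / MS_PER_DAY;
}

// Hypothetical classifier mirroring the branch in reviewQuarantinedTests():
// past the limit -> escalate or delete; otherwise -> re-run to check for a fix.
function nextReviewStep(quarantine, now = Date.now()) {
  return daysInQuarantine(quarantine.quarantinedAt, now) > quarantine.maxQuarantineDays
    ? 'ESCALATE_OR_DELETE'
    : 'REVALIDATE';
}

const now = Date.parse('2025-03-01T00:00:00Z');
console.log(nextReviewStep({ quarantinedAt: '2025-01-15T00:00:00Z', maxQuarantineDays: 30 }, now)); // ESCALATE_OR_DELETE
console.log(nextReviewStep({ quarantinedAt: '2025-02-20T00:00:00Z', maxQuarantineDays: 30 }, now)); // REVALIDATE
```

Passing `now` explicitly keeps the function deterministic, which matters for a component whose job is to remove nondeterminism from the test suite.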
696
-
697
- **Quarantine Dashboard:**
698
- ```
699
- ┌─────────────────────────────────────────────────────────┐
700
- │ Quarantined Tests Dashboard │
701
- ├─────────────────────────────────────────────────────────┤
702
- │ │
703
- │ Total Quarantined: 27 │
704
- │ Fixed & Reinstated: 15 (this month) │
705
- │ Escalated: 2 │
706
- │ Deleted: 3 │
707
- │ │
708
- │ By Category: │
709
- │ Race Condition: 14 tests │
710
- │ Timeout: 8 tests │
711
- │ Network Flake: 3 tests │
712
- │ Data Dependency: 2 tests │
713
- │ │
714
- │ By Owner: │
715
- │ Backend Team: 12 tests (avg 8 days) │
716
- │ Frontend Team: 9 tests (avg 12 days) │
717
- │ Mobile Team: 6 tests (avg 15 days) │
718
- │ │
719
- │ Overdue (>14 days): 5 tests ⚠️ │
720
- │ Critical (>30 days): 0 tests ✅ │
721
- │ │
722
- └─────────────────────────────────────────────────────────┘
723
- ```
724
-
725
- ### 5. Trend Tracking
726
-
727
- Tracks flakiness trends over time to identify systemic issues.
728
-
729
- **Trend Tracker:**
730
- ```javascript
731
- class FlakinessTrendTracker {
732
- async trackTrends(timeWindow = 90) {
733
- const trends = {
734
- overall: this.calculateOverallTrend(timeWindow),
735
- byCategory: this.calculateTrendsByCategory(timeWindow),
736
- byTeam: this.calculateTrendsByTeam(timeWindow),
737
- byTimeOfDay: this.calculateTrendsByTimeOfDay(timeWindow),
738
- predictions: this.predictFutureTrends(timeWindow)
739
- };
740
-
741
- return trends;
742
- }
743
-
744
- calculateOverallTrend(days) {
745
- const data = this.getHistoricalData(days);
746
-
747
- const weeklyFlakiness = [];
748
- for (let week = 0; week < days / 7; week++) {
749
- const weekData = data.filter(d =>
750
- d.timestamp >= Date.now() - (week + 1) * 7 * 24 * 60 * 60 * 1000 &&
751
- d.timestamp < Date.now() - week * 7 * 24 * 60 * 60 * 1000
752
- );
753
-
754
- weeklyFlakiness.push({
755
- week: week,
756
- flakyTests: weekData.filter(d => d.flaky).length,
757
- totalTests: weekData.length,
758
- flakinessRate: weekData.length ? weekData.filter(d => d.flaky).length / weekData.length : 0 // avoid NaN for weeks with no runs
759
- });
760
- }
761
-
762
- const trend = this.calculateTrendDirection(weeklyFlakiness);
763
-
764
- return {
765
- current: weeklyFlakiness[0].flakinessRate,
766
- trend: trend, // IMPROVING, STABLE, DEGRADING
767
- weeklyData: weeklyFlakiness,
768
- targetReliability: 0.95,
769
- daysToTarget: this.estimateDaysToTarget(weeklyFlakiness, 0.95)
770
- };
771
- }
772
- }
773
- ```
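`calculateOverallTrend()` relies on `calculateTrendDirection()`, which this file does not define. One plausible sketch, under stated assumptions, compares the mean flakiness rate of the more recent half of the window with that of the older half. The half-window split and the 0.005 tolerance (half a percentage point) are assumptions chosen for illustration.

```javascript
// Hypothetical implementation of calculateTrendDirection().
// weeklyFlakiness[0] is the most recent week, matching calculateOverallTrend().
// The 0.005 tolerance is an illustrative assumption.
function calculateTrendDirection(weeklyFlakiness, tolerance = 0.005) {
  // Skip non-finite rates, for example from a week with no recorded runs.
  const rates = weeklyFlakiness.map((w) => w.flakinessRate).filter(Number.isFinite);
  if (rates.length < 2) return 'STABLE';

  const mid = Math.floor(rates.length / 2);
  const mean = (xs) => xs.reduce((sum, x) => sum + x, 0) / xs.length;
  const recent = mean(rates.slice(0, mid)); // newer half of the window
  const older = mean(rates.slice(mid));     // older half of the window

  if (recent < older - tolerance) return 'IMPROVING';
  if (recent > older + tolerance) return 'DEGRADING';
  return 'STABLE';
}

const weeks = [0.02, 0.03, 0.05, 0.07].map((flakinessRate, week) => ({ week, flakinessRate }));
console.log(calculateTrendDirection(weeks)); // IMPROVING
```

A linear regression over the weekly rates would be less sensitive to a single noisy week; the half-window comparison is used here only because it is easy to verify by hand.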
774
-
775
- **Trend Visualization:**
776
- ```
777
- Flakiness Trend (Last 90 Days)
778
- ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
779
-
780
- [ASCII chart: flakiness rate, 2% to 8% on the vertical axis, plotted from 90d to Now in 10-day steps]
795
-
796
- Trend: ✅ IMPROVING (-65% in 90 days)
797
- Current: 2.1% (Target: <5%)
798
- Status: ✅ EXCEEDING TARGET
799
- ```
800
-
801
- ### 6. Reliability Scoring
802
-
803
- Assigns reliability scores to all tests for prioritization and monitoring.
804
-
805
- **Reliability Scorer:**
806
- ```javascript
807
- class ReliabilityScorer {
808
- calculateReliabilityScore(testName, history) {
809
- const weights = {
810
- recentPassRate: 0.4,
811
- overallPassRate: 0.2,
812
- consistency: 0.2,
813
- environmentalStability: 0.1,
814
- executionSpeed: 0.1
815
- };
816
-
817
- // Recent pass rate (last 30 runs)
818
- const recent = history.slice(-30);
819
- const recentPassRate = recent.filter(r => r.result === 'pass').length / recent.length;
820
-
821
- // Overall pass rate
822
- const overallPassRate = history.filter(r => r.result === 'pass').length / history.length;
823
-
824
- // Consistency (low variance in results)
825
- const consistency = 1 - this.calculateInconsistency(history);
826
-
827
- // Environmental stability (passes in all environments)
828
- const environmentalStability = this.calculateEnvironmentalStability(history);
829
-
830
- // Execution speed stability (low variance in duration)
831
- const executionSpeed = this.calculateExecutionSpeedStability(history);
832
-
833
- const score = (
834
- recentPassRate * weights.recentPassRate +
835
- overallPassRate * weights.overallPassRate +
836
- consistency * weights.consistency +
837
- environmentalStability * weights.environmentalStability +
838
- executionSpeed * weights.executionSpeed
839
- );
840
-
841
- return {
842
- score: score,
843
- grade: this.getReliabilityGrade(score),
844
- components: {
845
- recentPassRate,
846
- overallPassRate,
847
- consistency,
848
- environmentalStability,
849
- executionSpeed
850
- }
851
- };
852
- }
853
-
854
- getReliabilityGrade(score) {
855
- if (score >= 0.95) return 'A'; // Excellent
856
- if (score >= 0.90) return 'B'; // Good
857
- if (score >= 0.80) return 'C'; // Fair
858
- if (score >= 0.70) return 'D'; // Poor
859
- return 'F'; // Failing
860
- }
861
- }
862
- ```
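A worked example may make the weighting easier to follow. The sketch below reuses the weights and grade thresholds from `ReliabilityScorer`; the component values are made up for illustration, and `weightedScore` is a hypothetical standalone helper rather than a method of the class.

```javascript
// Weights copied from ReliabilityScorer.calculateReliabilityScore().
const weights = {
  recentPassRate: 0.4,
  overallPassRate: 0.2,
  consistency: 0.2,
  environmentalStability: 0.1,
  executionSpeed: 0.1
};

// Hypothetical helper: weighted sum of the five components.
function weightedScore(components) {
  return Object.keys(weights).reduce((sum, key) => sum + components[key] * weights[key], 0);
}

// Grade thresholds copied from ReliabilityScorer.getReliabilityGrade().
function getReliabilityGrade(score) {
  if (score >= 0.95) return 'A';
  if (score >= 0.90) return 'B';
  if (score >= 0.80) return 'C';
  if (score >= 0.70) return 'D';
  return 'F';
}

// Illustrative component values, not measured data.
const components = {
  recentPassRate: 1.0,
  overallPassRate: 0.9,
  consistency: 0.9,
  environmentalStability: 1.0,
  executionSpeed: 0.8
};

// 1.0*0.4 + 0.9*0.2 + 0.9*0.2 + 1.0*0.1 + 0.8*0.1 = 0.94
const score = weightedScore(components);
console.log(score.toFixed(2), getReliabilityGrade(score)); // 0.94 B
```

Because the recent pass rate carries 40% of the weight, a test with a perfect recent record still lands in grade B here when its consistency and speed stability lag.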
863
-
864
- ### 7. Predictive Flakiness
865
-
866
- Predicts which tests are likely to become flaky based on code changes and historical patterns.
867
-
868
- **Flakiness Predictor:**
869
- ```javascript
870
- class FlakinessPredictor {
871
- async predictFlakiness(testName, codeChanges) {
872
- const features = {
873
- // Test characteristics
874
- testComplexity: await this.calculateTestComplexity(testName),
875
- hasAsyncOperations: await this.hasAsyncOperations(testName),
876
- hasNetworkCalls: await this.hasNetworkCalls(testName),
877
- hasSharedState: await this.hasSharedState(testName),
878
-
879
- // Recent changes
880
- linesChanged: codeChanges.additions + codeChanges.deletions,
881
- filesChanged: codeChanges.files.length,
882
- asyncCodeAdded: this.detectAsyncCodeAddition(codeChanges),
883
-
884
- // Historical patterns
885
- authorFlakinessRate: await this.getAuthorFlakinessRate(codeChanges.author),
886
- moduleHistoricalFlakiness: await this.getModuleFlakiness(testName),
887
- recentFlakesInModule: await this.getRecentModuleFlakes(testName)
888
- };
889
-
890
- const prediction = await this.mlModel.predict(features);
891
-
892
- return {
893
- probability: prediction.probability,
894
- confidence: prediction.confidence,
895
- riskLevel: this.getRiskLevel(prediction.probability),
896
- recommendation: this.getRecommendation(prediction, features)
897
- };
898
- }
899
-
900
- getRecommendation(prediction, features) {
901
- if (prediction.probability > 0.7) {
902
- return {
903
- action: 'REVIEW_BEFORE_MERGE',
904
- message: 'High risk of flakiness - recommend thorough testing',
905
- suggestedActions: [
906
- 'Run test 20+ times before merge',
907
- 'Add explicit waits for async operations',
908
- 'Review for race conditions',
909
- 'Consider splitting into smaller tests'
910
- ]
911
- };
912
- }
913
-
914
- if (prediction.probability > 0.4) {
915
- return {
916
- action: 'MONITOR_CLOSELY',
917
- message: 'Medium risk - monitor after merge',
918
- suggestedActions: [
919
- 'Run test 10+ times before merge',
920
- 'Enable flakiness detection monitoring',
921
- 'Set up alerts for failures'
922
- ]
923
- };
924
- }
925
-
926
- return {
927
- action: 'STANDARD_PROCESS',
928
- message: 'Low risk - proceed normally'
929
- };
930
- }
931
- }
932
- ```
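`predictFlakiness()` calls `getRiskLevel()`, which is not defined in this file. A minimal sketch follows, assuming the risk levels use the same 0.7 and 0.4 cut-offs as `getRecommendation()`. The `HIGH`, `MEDIUM`, and `LOW` labels are assumptions.

```javascript
// Hypothetical getRiskLevel(); the source calls it but does not define it.
// The 0.7 and 0.4 cut-offs are borrowed from getRecommendation().
function getRiskLevel(probability) {
  if (!Number.isFinite(probability) || probability < 0 || probability > 1) {
    throw new RangeError(`probability must be in [0, 1], got ${probability}`);
  }
  if (probability > 0.7) return 'HIGH';
  if (probability > 0.4) return 'MEDIUM';
  return 'LOW';
}

console.log(getRiskLevel(0.85)); // HIGH
console.log(getRiskLevel(0.5));  // MEDIUM
console.log(getRiskLevel(0.1));  // LOW
```

Keeping the thresholds identical in both functions avoids a prediction being labelled one risk level while receiving the recommendation for another.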
933
-
934
- ## Integration Points
935
-
936
- ### Upstream Dependencies
937
- - **CI/CD Systems**: Test execution results (Jenkins, GitHub Actions)
938
- - **Test Runners**: Jest, Pytest, JUnit results
939
- - **Version Control**: Git for code analysis
940
- - **APM Tools**: Performance data (New Relic, Datadog)
941
-
942
- ### Downstream Consumers
943
- - **qe-test-executor**: Skips quarantined tests
944
- - **qe-regression-risk-analyzer**: Excludes flaky tests from selection
945
- - **qe-deployment-readiness**: Considers test reliability in risk score
946
- - **Development Teams**: Receives fix recommendations
947
-
948
- ### Coordination Agents
949
- - **qe-fleet-commander**: Orchestrates flaky test hunting
950
- - **qe-quality-gate**: Blocks builds with too many flaky tests
951
-
952
- ## Coordination Protocol
953
-
954
- This agent uses **AQE hooks (Agentic QE native hooks)** for coordination (zero external dependencies, 100-500x faster).
955
-
956
- **Automatic Lifecycle Hooks:**
957
- ```typescript
958
- // Automatically called by BaseAgent
959
- protected async onPreTask(data: { assignment: TaskAssignment }): Promise<void> {
960
- // Load test history and known flaky tests
961
- const testHistory = await this.memoryStore.retrieve('aqe/test-results/history');
962
- const knownFlaky = await this.memoryStore.retrieve('aqe/flaky-tests/known');
963
-
964
- this.logger.info('Flaky test detection started', {
965
- historicalRuns: testHistory?.length || 0,
966
- knownFlakyTests: knownFlaky?.length || 0
967
- });
968
- }
969
-
970
- protected async onPostTask(data: { assignment: TaskAssignment; result: any }): Promise<void> {
971
- // Store detected flaky tests and reliability scores
972
- await this.memoryStore.store('aqe/flaky-tests/detected', data.result.flakyTests);
973
- await this.memoryStore.store('aqe/test-reliability/scores', data.result.reliabilityScores);
974
-
975
- // Emit flaky test detection event
976
- this.eventBus.emit('flaky-hunter:completed', {
977
- newFlakyTests: data.result.flakyTests.length,
978
- quarantined: data.result.quarantined.length,
979
- avgReliability: data.result.reliabilityScores.average
980
- });
981
- }
982
-
983
- protected async onPostEdit(data: { filePath: string; changes: any }): Promise<void> {
984
- // Track test file updates
985
- if (data.filePath.includes('test')) {
986
- await this.memoryStore.store(`aqe/flaky-tests/test-updated/${data.filePath}`, {
987
- timestamp: Date.now(),
988
- stabilizationAttempt: true
989
- });
990
- }
991
- }
992
- ```
993
-
994
- **Advanced Verification (Optional):**
995
- ```typescript
996
- const hookManager = new VerificationHookManager(this.memoryStore);
997
- const verification = await hookManager.executePreTaskVerification({
998
- task: 'flaky-detection',
999
- context: {
1000
- requiredVars: ['NODE_ENV', 'TEST_FRAMEWORK'],
1001
- minMemoryMB: 512,
1002
- minHistoricalRuns: 10
1003
- }
1004
- });
1005
- ```
1006
-
1007
- ## Memory Keys
1008
-
1009
- ### Input Keys
1010
- - `aqe/test-results/history` - Historical test execution results
1011
- - `aqe/flaky-tests/known` - Known flaky tests registry
1012
- - `aqe/code-changes/current` - Recent code changes
1013
-
1014
- ### Output Keys
1015
- - `aqe/flaky-tests/detected` - Newly detected flaky tests
1016
- - `aqe/test-reliability/scores` - Test reliability scores
1017
- - `aqe/quarantine/active` - Currently quarantined tests
1018
- - `aqe/remediation/suggestions` - Auto-fix suggestions
1019
-
1020
- ### Coordination Keys
1021
- - `aqe/flaky-tests/status` - Detection status
1022
- - `aqe/flaky-tests/alerts` - Critical flakiness alerts
1023
-
1024
- ## Use Cases
1025
-
1026
- ### Use Case 1: Detect and Quarantine Flaky Tests
1027
-
1028
- **Scenario**: Identify flaky tests in CI and quarantine them.
1029
-
1030
- **Workflow:**
1031
- ```bash
1032
- # Detect flaky tests from last 30 days
1033
- aqe flaky detect --days 30 --min-runs 10
1034
-
1035
- # Analyze root causes
1036
- aqe flaky analyze --test "integration/checkout.test.ts"
1037
-
1038
- # Quarantine flaky tests
1039
- aqe flaky quarantine --severity HIGH --auto-assign
1040
-
1041
- # Generate report
1042
- aqe flaky report --output flaky-tests-report.html
1043
- ```
1044
-
1045
- ### Use Case 2: Auto-Stabilize Flaky Test
1046
-
1047
- **Scenario**: Automatically fix a flaky test with a race condition.
1048
-
1049
- **Workflow:**
1050
- ```bash
1051
- # Detect root cause
1052
- aqe flaky analyze --test "integration/payment.test.ts"
1053
-
1054
- # Attempt auto-stabilization
1055
- aqe flaky auto-fix --test "integration/payment.test.ts"
1056
-
1057
- # Validate fix
1058
- aqe flaky validate --test "integration/payment.test.ts" --runs 20
1059
-
1060
- # Reinstate if fixed
1061
- aqe flaky reinstate --test "integration/payment.test.ts"
1062
- ```
1063
-
1064
- ### Use Case 3: Track Flakiness Trends
1065
-
1066
- **Scenario**: Monitor flakiness trends and identify systemic issues.
1067
-
1068
- **Workflow:**
1069
- ```bash
1070
- # Generate trend report
1071
- aqe flaky trends --days 90 --format chart
1072
-
1073
- # Identify hotspots
1074
- aqe flaky hotspots --by module --threshold 0.10
1075
-
1076
- # Predict future flakiness
1077
- aqe flaky predict --target-date 2025-12-31
1078
- ```
1079
-
1080
- ## Success Metrics
1081
-
1082
- ### Quality Metrics
1083
- - **Test Reliability**: 95%+ (target achieved)
1084
- - **False Negative Rate**: <2% (flaky tests causing false passes)
1085
- - **False Positive Rate**: <3% (stable tests incorrectly flagged)
1086
- - **Detection Accuracy**: 98%
1087
-
1088
- ### Efficiency Metrics
1089
- - **Time to Detect Flakiness**: <1 hour (automated)
1090
- - **Time to Fix**: 80% fixed within 7 days
1091
- - **Quarantine Duration**: Average 8 days
1092
- - **Auto-Fix Success Rate**: 65%
1093
-
1094
- ### Business Metrics
1095
- - **CI Reliability**: 99.5% (no false failures blocking deployments)
1096
- - **Developer Trust**: 4.9/5 (high confidence in test results)
1097
- - **Time Saved**: 15 hours/week (no manual reruns)
1098
-
1099
- ## Commands
1100
-
1101
- ### Basic Commands
1102
-
1103
- ```bash
1104
- # Detect flaky tests
1105
- aqe flaky detect --days <number>
1106
-
1107
- # Analyze root cause
1108
- aqe flaky analyze --test <test-name>
1109
-
1110
- # Quarantine test
1111
- aqe flaky quarantine --test <test-name> --reason <reason>
1112
-
1113
- # Reinstate test
1114
- aqe flaky reinstate --test <test-name>
1115
-
1116
- # Generate report
1117
- aqe flaky report --output <file>
1118
- ```
1119
-
1120
- ### Advanced Commands
1121
-
1122
- ```bash
1123
- # Auto-fix flaky test
1124
- aqe flaky auto-fix --test <test-name> --validate
1125
-
1126
- # Track trends
1127
- aqe flaky trends --days <number> --format <html|chart|json>
1128
-
1129
- # Identify hotspots
1130
- aqe flaky hotspots --by <module|team|category>
1131
-
1132
- # Predict flakiness
1133
- aqe flaky predict --test <test-name> --changes <git-diff>
1134
-
1135
- # Review quarantined tests
1136
- aqe flaky review-quarantine --auto-reinstate
1137
- ```
1138
-
1139
- ### Specialized Commands
1140
-
1141
- ```bash
1142
- # Reliability scoring
1143
- aqe flaky reliability-score --test <test-name>
1144
-
1145
- # Bulk quarantine
1146
- aqe flaky bulk-quarantine --severity HIGH --days 7
1147
-
1148
- # Escalate overdue
1149
- aqe flaky escalate-overdue --threshold 30
1150
-
1151
- # Export quarantine dashboard
1152
- aqe flaky quarantine-dashboard --output dashboard.html
1153
-
1154
- # Flakiness heatmap
1155
- aqe flaky heatmap --by-module --output heatmap.png
1156
- ```
1157
-
1158
-
1159
- **Agent Status**: Production Ready
1160
- **Last Updated**: 2025-09-30
1161
- **Version**: 1.0.0
1162
- **Maintainer**: AQE Fleet Team