agentic-qe 2.0.0 → 2.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (116)
  1. package/.claude/agents/qx-partner.md +17 -4
  2. package/.claude/skills/accessibility-testing/SKILL.md +144 -692
  3. package/.claude/skills/agentic-quality-engineering/SKILL.md +176 -529
  4. package/.claude/skills/api-testing-patterns/SKILL.md +180 -560
  5. package/.claude/skills/brutal-honesty-review/SKILL.md +113 -603
  6. package/.claude/skills/bug-reporting-excellence/SKILL.md +116 -517
  7. package/.claude/skills/chaos-engineering-resilience/SKILL.md +127 -72
  8. package/.claude/skills/cicd-pipeline-qe-orchestrator/SKILL.md +209 -404
  9. package/.claude/skills/code-review-quality/SKILL.md +158 -608
  10. package/.claude/skills/compatibility-testing/SKILL.md +148 -38
  11. package/.claude/skills/compliance-testing/SKILL.md +132 -63
  12. package/.claude/skills/consultancy-practices/SKILL.md +114 -446
  13. package/.claude/skills/context-driven-testing/SKILL.md +117 -381
  14. package/.claude/skills/contract-testing/SKILL.md +176 -141
  15. package/.claude/skills/database-testing/SKILL.md +137 -130
  16. package/.claude/skills/exploratory-testing-advanced/SKILL.md +160 -629
  17. package/.claude/skills/holistic-testing-pact/SKILL.md +140 -188
  18. package/.claude/skills/localization-testing/SKILL.md +145 -33
  19. package/.claude/skills/mobile-testing/SKILL.md +132 -448
  20. package/.claude/skills/mutation-testing/SKILL.md +147 -41
  21. package/.claude/skills/performance-testing/SKILL.md +200 -546
  22. package/.claude/skills/quality-metrics/SKILL.md +164 -519
  23. package/.claude/skills/refactoring-patterns/SKILL.md +132 -699
  24. package/.claude/skills/regression-testing/SKILL.md +120 -926
  25. package/.claude/skills/risk-based-testing/SKILL.md +157 -660
  26. package/.claude/skills/security-testing/SKILL.md +199 -538
  27. package/.claude/skills/sherlock-review/SKILL.md +163 -699
  28. package/.claude/skills/shift-left-testing/SKILL.md +161 -465
  29. package/.claude/skills/shift-right-testing/SKILL.md +161 -519
  30. package/.claude/skills/six-thinking-hats/SKILL.md +175 -1110
  31. package/.claude/skills/skills-manifest.json +71 -20
  32. package/.claude/skills/tdd-london-chicago/SKILL.md +131 -448
  33. package/.claude/skills/technical-writing/SKILL.md +103 -154
  34. package/.claude/skills/test-automation-strategy/SKILL.md +166 -772
  35. package/.claude/skills/test-data-management/SKILL.md +126 -910
  36. package/.claude/skills/test-design-techniques/SKILL.md +179 -89
  37. package/.claude/skills/test-environment-management/SKILL.md +136 -91
  38. package/.claude/skills/test-reporting-analytics/SKILL.md +169 -92
  39. package/.claude/skills/testability-scoring/SKILL.md +172 -538
  40. package/.claude/skills/testability-scoring/scripts/generate-html-report.js +0 -0
  41. package/.claude/skills/visual-testing-advanced/SKILL.md +155 -78
  42. package/.claude/skills/xp-practices/SKILL.md +151 -587
  43. package/CHANGELOG.md +48 -0
  44. package/README.md +23 -16
  45. package/dist/agents/QXPartnerAgent.d.ts +8 -1
  46. package/dist/agents/QXPartnerAgent.d.ts.map +1 -1
  47. package/dist/agents/QXPartnerAgent.js +1174 -112
  48. package/dist/agents/QXPartnerAgent.js.map +1 -1
  49. package/dist/agents/lifecycle/AgentLifecycleManager.d.ts.map +1 -1
  50. package/dist/agents/lifecycle/AgentLifecycleManager.js +34 -31
  51. package/dist/agents/lifecycle/AgentLifecycleManager.js.map +1 -1
  52. package/dist/cli/commands/init-claude-md-template.d.ts.map +1 -1
  53. package/dist/cli/commands/init-claude-md-template.js +14 -0
  54. package/dist/cli/commands/init-claude-md-template.js.map +1 -1
  55. package/dist/core/SwarmCoordinator.d.ts +180 -0
  56. package/dist/core/SwarmCoordinator.d.ts.map +1 -0
  57. package/dist/core/SwarmCoordinator.js +473 -0
  58. package/dist/core/SwarmCoordinator.js.map +1 -0
  59. package/dist/core/metrics/MetricsAggregator.d.ts +228 -0
  60. package/dist/core/metrics/MetricsAggregator.d.ts.map +1 -0
  61. package/dist/core/metrics/MetricsAggregator.js +482 -0
  62. package/dist/core/metrics/MetricsAggregator.js.map +1 -0
  63. package/dist/core/metrics/index.d.ts +5 -0
  64. package/dist/core/metrics/index.d.ts.map +1 -0
  65. package/dist/core/metrics/index.js +11 -0
  66. package/dist/core/metrics/index.js.map +1 -0
  67. package/dist/core/optimization/SwarmOptimizer.d.ts +5 -0
  68. package/dist/core/optimization/SwarmOptimizer.d.ts.map +1 -1
  69. package/dist/core/optimization/SwarmOptimizer.js +17 -0
  70. package/dist/core/optimization/SwarmOptimizer.js.map +1 -1
  71. package/dist/core/orchestration/AdaptiveScheduler.d.ts +190 -0
  72. package/dist/core/orchestration/AdaptiveScheduler.d.ts.map +1 -0
  73. package/dist/core/orchestration/AdaptiveScheduler.js +460 -0
  74. package/dist/core/orchestration/AdaptiveScheduler.js.map +1 -0
  75. package/dist/core/orchestration/WorkflowOrchestrator.d.ts +13 -0
  76. package/dist/core/orchestration/WorkflowOrchestrator.d.ts.map +1 -1
  77. package/dist/core/orchestration/WorkflowOrchestrator.js +32 -0
  78. package/dist/core/orchestration/WorkflowOrchestrator.js.map +1 -1
  79. package/dist/core/recovery/CircuitBreaker.d.ts +176 -0
  80. package/dist/core/recovery/CircuitBreaker.d.ts.map +1 -0
  81. package/dist/core/recovery/CircuitBreaker.js +382 -0
  82. package/dist/core/recovery/CircuitBreaker.js.map +1 -0
  83. package/dist/core/recovery/RecoveryOrchestrator.d.ts +186 -0
  84. package/dist/core/recovery/RecoveryOrchestrator.d.ts.map +1 -0
  85. package/dist/core/recovery/RecoveryOrchestrator.js +476 -0
  86. package/dist/core/recovery/RecoveryOrchestrator.js.map +1 -0
  87. package/dist/core/recovery/RetryStrategy.d.ts +127 -0
  88. package/dist/core/recovery/RetryStrategy.d.ts.map +1 -0
  89. package/dist/core/recovery/RetryStrategy.js +314 -0
  90. package/dist/core/recovery/RetryStrategy.js.map +1 -0
  91. package/dist/core/recovery/index.d.ts +8 -0
  92. package/dist/core/recovery/index.d.ts.map +1 -0
  93. package/dist/core/recovery/index.js +27 -0
  94. package/dist/core/recovery/index.js.map +1 -0
  95. package/dist/core/skills/DependencyResolver.d.ts +99 -0
  96. package/dist/core/skills/DependencyResolver.d.ts.map +1 -0
  97. package/dist/core/skills/DependencyResolver.js +260 -0
  98. package/dist/core/skills/DependencyResolver.js.map +1 -0
  99. package/dist/core/skills/ManifestGenerator.d.ts +114 -0
  100. package/dist/core/skills/ManifestGenerator.d.ts.map +1 -0
  101. package/dist/core/skills/ManifestGenerator.js +449 -0
  102. package/dist/core/skills/ManifestGenerator.js.map +1 -0
  103. package/dist/core/skills/index.d.ts +9 -0
  104. package/dist/core/skills/index.d.ts.map +1 -0
  105. package/dist/core/skills/index.js +24 -0
  106. package/dist/core/skills/index.js.map +1 -0
  107. package/dist/mcp/server.d.ts +9 -9
  108. package/dist/mcp/server.d.ts.map +1 -1
  109. package/dist/mcp/server.js +1 -2
  110. package/dist/mcp/server.js.map +1 -1
  111. package/dist/types/qx.d.ts +39 -7
  112. package/dist/types/qx.d.ts.map +1 -1
  113. package/dist/types/qx.js.map +1 -1
  114. package/dist/visualization/api/RestEndpoints.js +1 -1
  115. package/dist/visualization/api/RestEndpoints.js.map +1 -1
  116. package/package.json +13 -55
@@ -1,580 +1,225 @@
  ---
  name: quality-metrics
- description: Measure quality effectively with actionable metrics. Use when establishing quality dashboards, defining KPIs, or evaluating test effectiveness.
+ description: "Measure quality effectively with actionable metrics. Use when establishing quality dashboards, defining KPIs, or evaluating test effectiveness."
+ category: testing-methodologies
+ priority: high
+ tokenEstimate: 900
+ agents: [qe-quality-analyzer, qe-test-executor, qe-coverage-analyzer, qe-production-intelligence, qe-quality-gate]
+ implementation_status: optimized
+ optimization_version: 1.0
+ last_optimized: 2025-12-02
+ dependencies: []
+ quick_reference_card: true
+ tags: [metrics, dora, quality-gates, dashboards, kpis, measurement]
  ---
 
  # Quality Metrics
 
- ## Core Principle
+ <default_to_action>
+ When measuring quality or building dashboards:
+ 1. MEASURE outcomes (bug escape rate, MTTD) not activities (test count)
+ 2. FOCUS on DORA metrics: Deployment frequency, Lead time, MTTD, MTTR, Change failure rate
+ 3. AVOID vanity metrics: 100% coverage means nothing if tests don't catch bugs
+ 4. SET thresholds that drive behavior (quality gates block bad code)
+ 5. TREND over time: Direction matters more than absolute numbers
+
+ **Quick Metric Selection:**
+ - Speed: Deployment frequency, lead time for changes
+ - Stability: Change failure rate, MTTR
+ - Quality: Bug escape rate, defect density, test effectiveness
+ - Process: Code review time, flaky test rate
+
+ **Critical Success Factors:**
+ - Metrics without action are theater
+ - What you measure is what you optimize
+ - Trends matter more than snapshots
+ </default_to_action>
+
+ ## Quick Reference Card
+
+ ### When to Use
+ - Building quality dashboards
+ - Defining quality gates
+ - Evaluating testing effectiveness
+ - Justifying quality investments
+
+ ### Meaningful vs Vanity Metrics
+ | ✅ Meaningful | ❌ Vanity |
+ |--------------|-----------|
+ | Bug escape rate | Test case count |
+ | MTTD (detection) | Lines of test code |
+ | MTTR (recovery) | Test executions |
+ | Change failure rate | Coverage % (alone) |
+ | Lead time for changes | Requirements traced |
+
+ ### DORA Metrics
+ | Metric | Elite | High | Medium | Low |
+ |--------|-------|------|--------|-----|
+ | Deploy Frequency | On-demand | Weekly | Monthly | Yearly |
+ | Lead Time | < 1 hour | < 1 week | < 1 month | > 6 months |
+ | Change Failure Rate | < 5% | < 15% | < 30% | > 45% |
+ | MTTR | < 1 hour | < 1 day | < 1 week | > 1 month |
+
+ ### Quality Gate Thresholds
+ | Metric | Blocking Threshold | Warning |
+ |--------|-------------------|---------|
+ | Test pass rate | 100% | - |
+ | Critical coverage | > 80% | > 70% |
+ | Security critical | 0 | - |
+ | Performance p95 | < 200ms | < 500ms |
+ | Flaky tests | < 2% | < 5% |
 
- **Measure what matters, not what's easy to measure.**
-
- Metrics should drive better decisions, not just prettier dashboards. If a metric doesn't change behavior or inform action, stop tracking it.
-
- ## The Vanity Metrics Problem
-
- ### Vanity Metrics (Stop Measuring These)
-
- **Test Count**
- - "We have 5,000 tests!"
- - So what? Are they finding bugs? Are they maintainable? Do they give confidence?
-
- **Code Coverage Percentage**
- - "We achieved 85% coverage!"
- - Useless without context. 85% of what? Critical paths? Or just getters/setters?
-
- **Test Cases Executed**
- - "Ran 10,000 test cases today!"
- - How many found problems? How many are redundant?
-
- **Bugs Found**
- - "QA found 200 bugs this sprint!"
- - Is that good or bad? Are they trivial or critical? Should they have been found earlier?
-
- **Story Points Completed**
- - "We completed 50 points of testing work!"
- - Points are relative and gameable. What actually got better?
-
- ### Why Vanity Metrics Fail
-
- 1. **Easily gamed**: People optimize for the metric, not the goal
- 2. **No context**: Numbers without meaning
- 3. **No action**: What do you do differently based on this number?
- 4. **False confidence**: High numbers that mean nothing
-
- ## Meaningful Metrics
-
- ### 1. Defect Escape Rate
-
- **What**: Percentage of bugs that reach production vs. caught before release
+ ---
 
- **Why it matters**: Measures effectiveness of your quality process
+ ## Core Metrics
 
- **How to measure**:
- ```
- Defect Escape Rate = (Production Bugs / Total Bugs Found) × 100
+ ### Bug Escape Rate
  ```
+ Bug Escape Rate = (Production Bugs / Total Bugs Found) × 100
 
- **Good**: < 5% escape rate
- **Needs work**: > 15% escape rate
-
- **Actions**:
- - High escape rate → Shift testing left, improve risk assessment
- - Low escape rate but slow releases → Maybe over-testing, reduce friction
-
- ### 2. Mean Time to Detect (MTTD)
-
- **What**: How long from bug introduction to discovery
-
- **Why it matters**: Faster detection = cheaper fixes
-
- **How to measure**:
+ Target: < 10% (90% caught before production)
  ```
- MTTD = Time bug found - Time bug introduced
- ```
-
- **Good**: < 1 day for critical paths
- **Needs work**: > 1 week
-
- **Actions**:
- - High MTTD → Add monitoring, improve test coverage on critical paths
- - Very low MTTD → Your fast feedback loops are working
-
- ### 3. Mean Time to Resolution (MTTR)
-
- **What**: Time from bug discovery to fix deployed
 
- **Why it matters**: Indicates team efficiency and process friction
-
- **How to measure**:
- ```
- MTTR = Time fix deployed - Time bug discovered
+ ### Test Effectiveness
  ```
+ Test Effectiveness = (Bugs Found by Tests / Total Bugs) × 100
 
- **Good**: < 24 hours for critical bugs, < 1 week for minor
- **Needs work**: > 1 week for critical bugs
-
- **Actions**:
- - High MTTR → Investigate bottlenecks (test env access? deployment pipeline? handoffs?)
- - Very low MTTR but high escape rate → Rushing fixes, need better verification
-
- ### 4. Deployment Frequency
-
- **What**: How often you deploy to production
-
- **Why it matters**: Proxy for team confidence and process maturity
-
- **How to measure**:
- ```
- Deployments per week (or day)
+ Target: > 70%
  ```
 
- **Good**: Multiple per day
- **Decent**: Multiple per week
- **Needs work**: Less than weekly
-
- **Actions**:
- - Low frequency → Reduce batch size, improve automation, build confidence
- - High frequency with high defect rate → Need better automated checks
-
- ### 5. Change Failure Rate
-
- **What**: Percentage of deployments that cause production issues
-
- **Why it matters**: Measures release quality
-
- **How to measure**:
+ ### Defect Density
  ```
- Change Failure Rate = (Failed Deployments / Total Deployments) × 100
- ```
-
- **Good**: < 5%
- **Needs work**: > 15%
+ Defect Density = Defects / KLOC
 
- **Actions**:
- - High failure rate → Improve pre-production validation, add canary deployments
- - Very low but slow releases → Maybe you can deploy more frequently
-
- ### 6. Test Execution Time
-
- **What**: How long your test suite takes to run
-
- **Why it matters**: Slow tests = slow feedback = less frequent testing
-
- **How to measure**:
- ```
- Time from commit to test completion
+ Good: < 1 defect per KLOC
  ```
 
- **Good**: < 10 minutes for unit tests, < 30 minutes for full suite
- **Needs work**: > 1 hour
-
- **Actions**:
- - Slow tests → Parallelize, remove redundant tests, optimize slow tests
- - Fast tests but bugs escaping → Coverage gaps, need better tests
-
- ### 7. Flaky Test Rate
-
- **What**: Percentage of tests that fail intermittently
-
- **Why it matters**: Flaky tests destroy confidence
-
- **How to measure**:
+ ### Mean Time to Detect (MTTD)
  ```
- Flaky Test Rate = (Flaky Tests / Total Tests) × 100
- ```
-
- **Good**: < 1%
- **Needs work**: > 5%
-
- **Actions**:
- - High flakiness → Fix or delete flaky tests immediately (quarantine pattern)
- - Low flakiness → Maintain vigilance, don't let it creep up
-
- ## Context-Specific Metrics
-
- ### For Startups
-
- **Focus on**:
- - Deployment frequency (speed to market)
- - Critical path coverage (protect revenue)
- - MTTR (move fast, fix fast)
-
- **Skip**:
- - Comprehensive coverage metrics
- - Detailed test documentation
- - Complex traceability
-
- ### For Regulated Industries
-
- **Focus on**:
- - Traceability (requirement → test → result)
- - Test documentation completeness
- - Audit trail integrity
-
- **Don't skip**:
- - Deployment frequency still matters
- - But compliance isn't optional
-
- ### For Established Products
-
- **Focus on**:
- - Defect escape rate (protect reputation)
- - Regression detection (maintain stability)
- - Test maintenance cost
-
- **Balance**:
- - Innovation vs. stability
- - New features vs. technical debt
-
- ## Leading vs. Lagging Indicators
-
- ### Lagging Indicators (Rearview Mirror)
- - Defect escape rate
- - Production incidents
- - Customer complaints
- - MTTR
-
- **Use for**: Understanding what happened, trending over time
-
- ### Leading Indicators (Windshield)
- - Code review quality
- - Test coverage on new code
- - Deployment frequency trend
- - Team confidence surveys
-
- **Use for**: Predicting problems, early intervention
-
- ## Metrics for Different Audiences
-
- ### For Developers
- - Test execution time
- - Flaky test rate
- - Code review turnaround
- - Build failure frequency
-
- **Language**: Technical, actionable
-
- ### For Product/Management
- - Deployment frequency
- - Change failure rate
- - Feature lead time
- - Customer-impacting incidents
-
- **Language**: Business outcomes, not technical details
-
- ### For Executive Leadership
- - Defect escape rate trend
- - Mean time to resolution
- - Release velocity
- - Customer satisfaction (related to quality)
-
- **Language**: Business impact, strategic
-
- ## Building a Metrics Dashboard
-
- ### Essential Dashboard (Start Here)
-
- **Top Row (Health)**
- - Defect escape rate (last 30 days)
- - Deployment frequency (last 7 days)
- - Change failure rate (last 30 days)
-
- **Middle Row (Speed)**
- - MTTD (average, last 30 days)
- - MTTR (average, last 30 days)
- - Test execution time (current)
-
- **Bottom Row (Trends)**
- - All of the above as sparklines (3-6 months)
+ MTTD = Time(Bug Reported) - Time(Bug Introduced)
 
- ### Advanced Dashboard (If Needed)
-
- Add:
- - Flaky test rate
- - Test coverage on critical paths (not overall %)
- - Production error rate
- - Customer-reported bugs vs. internally found
-
- ## Anti-Patterns
-
- ### ❌ Metric-Driven Development
- **Problem**: Optimizing for metrics instead of quality
-
- **Example**: Writing useless tests to hit coverage targets
-
- **Fix**: Focus on outcomes (can we deploy confidently?) not numbers
-
- ### ❌ Too Many Metrics
- **Problem**: Dashboard overload, no clear priorities
-
- **Example**: Tracking 30+ metrics that no one understands
-
- **Fix**: Start with 5-7 core metrics, add only if they drive decisions
-
- ### ❌ Metrics Without Action
- **Problem**: Tracking numbers but not changing behavior
-
- **Example**: Watching MTTR climb for months without investigating
-
- **Fix**: For every metric, define thresholds and actions
-
- ### ❌ Gaming the System
- **Problem**: People optimize for metrics, not quality
-
- **Example**: Marking bugs as "won't fix" to improve resolution time
-
- **Fix**: Multiple complementary metrics, qualitative reviews
-
- ### ❌ One-Size-Fits-All
- **Problem**: Using same metrics for all teams/contexts
-
- **Example**: Measuring startup team same as regulated medical device team
-
- **Fix**: Context-driven metric selection
-
- ## Metric Hygiene
-
- ### Review Quarterly
- - Are we still using this metric to make decisions?
- - Is it being gamed?
- - Does it reflect current priorities?
-
- ### Adjust Thresholds
- - What's "good" changes as you improve
- - Don't keep celebrating the same baseline
- - Raise the bar when appropriate
-
- ### Kill Zombie Metrics
- - If no one looks at it → Delete it
- - If no one can explain what action to take → Delete it
- - If it's always green or always red → Delete it
-
- ## Real-World Examples
-
- ### Example 1: E-Commerce Company
-
- **Before**:
- - Measured: Test count (5,000 tests)
- - Result: Slow CI, frequent production bugs
-
- **After**:
- - Measured: Defect escape rate (8%), MTTD (3 days), deployment frequency (2/week)
- - Actions:
- - Removed 2,000 redundant tests
- - Added monitoring for critical paths
- - Improved deployment pipeline
- - Result: Escape rate to 3%, MTTD to 6 hours, deploy 5x/day
-
- ### Example 2: SaaS Platform
-
- **Before**:
- - Measured: Code coverage (85%)
- - Result: False confidence, bugs in uncovered critical paths
-
- **After**:
- - Measured: Critical path coverage (60%), deployment frequency, change failure rate
- - Actions:
- - Focused testing on payment, auth, data integrity
- - Removed tests on deprecated features
- - Added production monitoring
- - Result: Fewer production incidents, faster releases
-
- ## Questions to Ask About Any Metric
-
- 1. **What decision does this inform?**
- - If none → Don't track it
-
- 2. **What action do we take if it's red?**
- - If you don't know → Define thresholds and actions
-
- 3. **Can this be gamed?**
- - If yes → Add complementary metrics
-
- 4. **Does this reflect actual quality?**
- - If no → Replace it with something that does
-
- 5. **Who needs to see this?**
- - If no one → Stop tracking it
-
- ## Remember
-
- **Good metrics**:
- - Drive better decisions
- - Are actionable
- - Reflect actual outcomes
- - Change as you mature
-
- **Bad metrics**:
- - Make dashboards pretty
- - Are easily gamed
- - Provide false confidence
- - Persist long after they're useful
-
- **Start small**: 5-7 metrics that matter
- **Review often**: Quarterly at minimum
- **Kill ruthlessly**: Remove metrics that don't drive action
- **Stay contextual**: What matters changes with your situation
-
- ## Using with QE Agents
-
- ### Automated Metrics Collection
-
- **qe-quality-analyzer** collects and analyzes quality metrics:
- ```typescript
- // Agent collects comprehensive metrics automatically
- await agent.collectMetrics({
- scope: 'all',
- timeframe: '30d',
- categories: [
- 'deployment-frequency',
- 'defect-escape-rate',
- 'test-execution-time',
- 'flaky-test-rate',
- 'coverage-trends'
- ]
- });
-
- // Returns real-time dashboard data
- // No manual tracking required
+ Target: < 1 day for critical, < 1 week for others
  ```
 
- ### Intelligent Metric Analysis
+ ---
 
- **qe-quality-analyzer** identifies trends and anomalies:
- ```typescript
- // Agent detects metric anomalies
- const analysis = await agent.analyzeTrends({
- metric: 'defect-escape-rate',
- timeframe: '90d',
- alertThreshold: 0.15
- });
+ ## Dashboard Design
 
- // Returns:
- // {
- // trend: 'increasing',
- // currentValue: 0.18,
- // avgValue: 0.08,
- // anomaly: true,
- // recommendation: 'Increase pre-release testing focus',
- // relatedMetrics: ['test-coverage: decreasing', 'MTTR: increasing']
- // }
+ ```typescript
+ // Agent generates quality dashboard
+ await Task("Generate Dashboard", {
+ metrics: {
+ delivery: ['deployment-frequency', 'lead-time', 'change-failure-rate'],
+ quality: ['bug-escape-rate', 'test-effectiveness', 'defect-density'],
+ stability: ['mttd', 'mttr', 'availability'],
+ process: ['code-review-time', 'flaky-test-rate', 'coverage-trend']
+ },
+ visualization: 'grafana',
+ alerts: {
+ critical: { bug_escape_rate: '>20%', mttr: '>24h' },
+ warning: { coverage: '<70%', flaky_rate: '>5%' }
+ }
+ }, "qe-quality-analyzer");
  ```
 
- ### Actionable Insights from Metrics
+ ---
 
- **qe-quality-gate** uses metrics for decision-making:
- ```typescript
- // Agent makes GO/NO-GO decisions based on metrics
- const decision = await agent.evaluateMetrics({
- release: 'v3.2',
- thresholds: {
- defectEscapeRate: '<5%',
- changeFailureRate: '<10%',
- testExecutionTime: '<15min',
- flakyTestRate: '<2%'
+ ## Quality Gate Configuration
+
+ ```json
+ {
+ "qualityGates": {
+ "commit": {
+ "coverage": { "min": 80, "blocking": true },
+ "lint": { "errors": 0, "blocking": true }
+ },
+ "pr": {
+ "tests": { "pass": "100%", "blocking": true },
+ "security": { "critical": 0, "blocking": true },
+ "coverage_delta": { "min": 0, "blocking": false }
+ },
+ "release": {
+ "e2e": { "pass": "100%", "blocking": true },
+ "performance_p95": { "max_ms": 200, "blocking": true },
+ "bug_escape_rate": { "max": "10%", "blocking": false }
+ }
  }
- });
-
- // Returns:
- // {
- // decision: 'NO-GO',
- // blockers: [
- // 'Flaky test rate: 4.2% (threshold: 2%)'
- // ],
- // recommendations: [
- // 'Run qe-flaky-test-hunter to stabilize tests'
- // ]
- // }
+ }
  ```
 
- ### Real-Time Metrics Dashboard
+ ---
+
+ ## Agent-Assisted Metrics
 
- **qe-quality-analyzer** generates live dashboards:
  ```typescript
- // Agent creates context-specific dashboards
- await agent.createDashboard({
- audience: 'executive', // or 'developer', 'product'
- focus: 'release-readiness',
- updateFrequency: 'real-time'
- });
+ // Calculate quality trends
+ await Task("Quality Trend Analysis", {
+ timeframe: '90d',
+ metrics: ['bug-escape-rate', 'mttd', 'test-effectiveness'],
+ compare: 'previous-90d',
+ predictNext: '30d'
+ }, "qe-quality-analyzer");
 
- // Executive Dashboard:
- // - Defect escape rate: 3.2% ✅
- // - Deployment frequency: 5/day ✅
- // - Change failure rate: 7% ✅
- // - Customer-impacting incidents: 1 (down from 3)
+ // Evaluate quality gate
+ await Task("Quality Gate Evaluation", {
+ buildId: 'build-123',
+ environment: 'staging',
+ metrics: currentMetrics,
+ policy: qualityPolicy
+ }, "qe-quality-gate");
  ```
 
- ### Metric-Driven Test Optimization
+ ---
 
- **qe-regression-risk-analyzer** uses metrics to optimize testing:
- ```typescript
- // Agent identifies which tests provide most value
- const optimization = await agent.optimizeTestSuite({
- metrics: {
- executionTime: 'per-test',
- defectDetectionRate: 'per-test',
- maintenanceCost: 'per-test'
- },
- goal: 'maximize-value-per-minute'
- });
+ ## Agent Coordination Hints
 
- // Recommends:
- // - Remove 50 tests with 0% defect detection (save 15 min)
- // - Keep top 200 tests (95% defect detection)
- // - Result: 40% faster suite, 5% defect detection loss
+ ### Memory Namespace
+ ```
+ aqe/quality-metrics/
+ ├── dashboards/* - Dashboard configurations
+ ├── trends/* - Historical metric data
+ ├── gates/* - Gate evaluation results
+ └── alerts/* - Triggered alerts
  ```
 
- ### Fleet Coordination for Metrics
-
+ ### Fleet Coordination
  ```typescript
- // Multiple agents collaborate on metrics collection and analysis
  const metricsFleet = await FleetManager.coordinate({
  strategy: 'quality-metrics',
  agents: [
- 'qe-test-executor', // Collect execution metrics
- 'qe-coverage-analyzer', // Collect coverage metrics
- 'qe-production-intelligence', // Collect production metrics
- 'qe-quality-analyzer', // Analyze and visualize
- 'qe-quality-gate' // Make decisions
+ 'qe-quality-analyzer', // Trend analysis
+ 'qe-test-executor', // Test metrics
+ 'qe-coverage-analyzer', // Coverage data
+ 'qe-production-intelligence', // Production metrics
+ 'qe-quality-gate' // Gate decisions
  ],
- topology: 'hierarchical'
- });
-
- // Continuous metrics pipeline
- await metricsFleet.execute({
- schedule: 'continuous',
- aggregationInterval: '5min'
+ topology: 'mesh'
  });
  ```
 
- ### Context-Aware Metric Selection
+ ---
 
- ```typescript
- // Agent recommends metrics based on context
- const recommendation = await qe-quality-analyzer.recommendMetrics({
- context: 'startup',
- stage: 'early',
- team: 'small',
- compliance: 'none'
- });
+ ## Common Traps
 
- // Recommends:
- // - deployment-frequency (speed to market)
- // - critical-path-coverage (protect revenue)
- // - MTTR (move fast, fix fast)
- //
- // Skip:
- // - comprehensive coverage %
- // - detailed traceability
- // - process compliance metrics
- ```
+ | Trap | Problem | Solution |
+ |------|---------|----------|
+ | Coverage worship | 100% coverage, bugs still escape | Measure bug escape rate instead |
+ | Test count focus | Many tests, slow feedback | Measure execution time |
+ | Activity metrics | Busy work, no outcomes | Measure outcomes (MTTD, MTTR) |
+ | Point-in-time | Snapshot without context | Track trends over time |
 
  ---
 
  ## Related Skills
-
- **Core Quality Practices:**
- - [agentic-quality-engineering](../agentic-quality-engineering/) - Metrics-driven agent coordination
- - [holistic-testing-pact](../holistic-testing-pact/) - Metrics across test quadrants
-
- **Testing Approaches:**
- - [risk-based-testing](../risk-based-testing/) - Risk-based metric selection
- - [test-automation-strategy](../test-automation-strategy/) - Automation effectiveness metrics
- - [exploratory-testing-advanced](../exploratory-testing-advanced/) - Exploratory session metrics
-
- **Development Practices:**
- - [xp-practices](../xp-practices/) - XP success metrics (velocity, lead time)
+ - [agentic-quality-engineering](../agentic-quality-engineering/) - Agent coordination
+ - [cicd-pipeline-qe-orchestrator](../cicd-pipeline-qe-orchestrator/) - Quality gates
+ - [risk-based-testing](../risk-based-testing/) - Risk-informed metrics
+ - [shift-right-testing](../shift-right-testing/) - Production metrics
 
  ---
 
- ## Resources
-
- - **Accelerate** by Forsgren, Humble, Kim (DORA metrics)
- - **How to Measure Anything** by Douglas Hubbard (measuring intangibles)
- - Your own retrospectives (which metrics helped? Which didn't?)
+ ## Remember
 
- Metrics are tools for better decisions, not scorecards for performance reviews. Use them wisely.
+ **Measure outcomes, not activities.** Bug escape rate > test count. MTTD/MTTR > coverage %. Trends > snapshots. Set gates that block bad code. What you measure is what you optimize.
 
- **With Agents**: Agents automate metrics collection, detect trends and anomalies, and provide context-aware recommendations. Use agents to make metrics actionable and avoid vanity metrics. Agents continuously analyze what drives quality outcomes in your specific context.
+ **With Agents:** Agents track metrics automatically, analyze trends, trigger alerts, and make gate decisions. Use agents to maintain continuous quality visibility.
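For reference, the Bug Escape Rate and MTTD formulas introduced in the new Core Metrics section can be computed as in the following minimal sketch. These helper functions and the `Bug` shape are illustrative only and are not part of the agentic-qe API:

```typescript
// Illustrative helpers; not part of the agentic-qe package.

interface Bug {
  introducedAt: Date;        // when the defect entered the codebase
  reportedAt: Date;          // when the defect was detected
  foundInProduction: boolean;
}

// Bug Escape Rate = (Production Bugs / Total Bugs Found) × 100
function bugEscapeRate(bugs: Bug[]): number {
  if (bugs.length === 0) return 0;
  const escaped = bugs.filter(b => b.foundInProduction).length;
  return (escaped / bugs.length) * 100;
}

// MTTD = mean of Time(Bug Reported) − Time(Bug Introduced), in hours
function meanTimeToDetectHours(bugs: Bug[]): number {
  if (bugs.length === 0) return 0;
  const totalMs = bugs.reduce(
    (sum, b) => sum + (b.reportedAt.getTime() - b.introducedAt.getTime()),
    0
  );
  return totalMs / bugs.length / 3_600_000;
}

// Sample data: one bug caught pre-release in 12h, one escaped and detected after 24h
const bugs: Bug[] = [
  {
    introducedAt: new Date("2025-01-01T00:00:00Z"),
    reportedAt: new Date("2025-01-01T12:00:00Z"),
    foundInProduction: false,
  },
  {
    introducedAt: new Date("2025-01-02T00:00:00Z"),
    reportedAt: new Date("2025-01-03T00:00:00Z"),
    foundInProduction: true,
  },
];

console.log(bugEscapeRate(bugs));         // 50
console.log(meanTimeToDetectHours(bugs)); // 18
```

Against the skill's targets, this sample dataset would fail the < 10% escape-rate target but meet the < 1 day MTTD target for non-critical bugs.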