@athenaflow/plugin-e2e-test-builder 2.0.9 → 2.0.10

Files changed (167)
  1. package/.claude-plugin/plugin.json +1 -1
  2. package/.codex-plugin/plugin.json +1 -1
  3. package/dist/{2.0.8 → 2.0.10}/.agents/plugins/marketplace.json +1 -1
  4. package/dist/{2.0.9 → 2.0.10}/claude/plugin/.claude-plugin/plugin.json +1 -1
  5. package/dist/{2.0.9 → 2.0.10}/claude/plugin/package.json +8 -2
  6. package/dist/{2.0.9 → 2.0.10}/claude/plugin/skills/add-e2e-tests/SKILL.md +18 -65
  7. package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/add-e2e-tests/agents/openai.yaml +1 -1
  8. package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/add-e2e-tests/references/error-recovery.md +3 -3
  9. package/dist/{2.0.8/codex → 2.0.10/claude}/plugin/skills/add-e2e-tests/references/scaffolding.md +1 -1
  10. package/dist/{2.0.9 → 2.0.10}/claude/plugin/skills/fix-flaky-tests/SKILL.md +1 -1
  11. package/dist/{2.0.8/codex → 2.0.10/claude}/plugin/skills/fix-flaky-tests/references/fix-patterns.md +3 -2
  12. package/dist/{2.0.9 → 2.0.10}/claude/plugin/skills/generate-test-cases/SKILL.md +8 -2
  13. package/dist/{2.0.9 → 2.0.10}/claude/plugin/skills/plan-test-coverage/SKILL.md +7 -6
  14. package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/review-test-cases/SKILL.md +3 -4
  15. package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/write-test-code/SKILL.md +4 -3
  16. package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/write-test-code/references/api-setup-teardown.md +1 -1
  17. package/dist/{2.0.9 → 2.0.10}/codex/plugin/.codex-plugin/plugin.json +1 -1
  18. package/dist/{2.0.9 → 2.0.10}/codex/plugin/package.json +8 -2
  19. package/dist/{2.0.9 → 2.0.10}/codex/plugin/skills/add-e2e-tests/SKILL.md +18 -65
  20. package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/add-e2e-tests/agents/openai.yaml +1 -1
  21. package/dist/{2.0.9/claude → 2.0.10/codex}/plugin/skills/add-e2e-tests/references/error-recovery.md +3 -3
  22. package/dist/{2.0.9/claude → 2.0.10/codex}/plugin/skills/add-e2e-tests/references/scaffolding.md +1 -1
  23. package/dist/{2.0.8/claude → 2.0.10/codex}/plugin/skills/fix-flaky-tests/SKILL.md +1 -1
  24. package/dist/{2.0.9/claude → 2.0.10/codex}/plugin/skills/fix-flaky-tests/references/fix-patterns.md +3 -2
  25. package/dist/{2.0.9 → 2.0.10}/codex/plugin/skills/generate-test-cases/SKILL.md +8 -2
  26. package/dist/{2.0.9 → 2.0.10}/codex/plugin/skills/plan-test-coverage/SKILL.md +7 -6
  27. package/dist/{2.0.9/claude → 2.0.10/codex}/plugin/skills/review-test-cases/SKILL.md +3 -4
  28. package/dist/{2.0.9/claude → 2.0.10/codex}/plugin/skills/write-test-code/SKILL.md +4 -3
  29. package/dist/{2.0.9/claude → 2.0.10/codex}/plugin/skills/write-test-code/references/api-setup-teardown.md +1 -1
  30. package/dist/{2.0.9 → 2.0.10}/release.json +1 -1
  31. package/package.json +7 -1
  32. package/skills/add-e2e-tests/SKILL.md +18 -65
  33. package/skills/add-e2e-tests/agents/openai.yaml +1 -1
  34. package/skills/add-e2e-tests/references/error-recovery.md +3 -3
  35. package/skills/add-e2e-tests/references/scaffolding.md +1 -1
  36. package/skills/fix-flaky-tests/SKILL.md +1 -1
  37. package/skills/fix-flaky-tests/references/fix-patterns.md +3 -2
  38. package/skills/generate-test-cases/SKILL.md +8 -2
  39. package/skills/plan-test-coverage/SKILL.md +7 -6
  40. package/skills/review-test-cases/SKILL.md +3 -4
  41. package/skills/write-test-code/SKILL.md +4 -3
  42. package/skills/write-test-code/references/api-setup-teardown.md +1 -1
  43. package/dist/2.0.8/claude/plugin/.claude-plugin/plugin.json +0 -20
  44. package/dist/2.0.8/claude/plugin/package.json +0 -9
  45. package/dist/2.0.8/claude/plugin/skills/add-e2e-tests/SKILL.md +0 -217
  46. package/dist/2.0.8/claude/plugin/skills/add-e2e-tests/agents/claude.yaml +0 -1
  47. package/dist/2.0.8/claude/plugin/skills/add-e2e-tests/references/scaffolding.md +0 -12
  48. package/dist/2.0.8/claude/plugin/skills/add-e2e-tests/references/tracker-template.md +0 -53
  49. package/dist/2.0.8/claude/plugin/skills/fix-flaky-tests/references/fix-patterns.md +0 -91
  50. package/dist/2.0.8/claude/plugin/skills/generate-test-cases/SKILL.md +0 -184
  51. package/dist/2.0.8/claude/plugin/skills/plan-test-coverage/SKILL.md +0 -116
  52. package/dist/2.0.8/codex/plugin/.codex-plugin/plugin.json +0 -15
  53. package/dist/2.0.8/codex/plugin/package.json +0 -9
  54. package/dist/2.0.8/codex/plugin/skills/add-e2e-tests/SKILL.md +0 -217
  55. package/dist/2.0.8/codex/plugin/skills/add-e2e-tests/agents/claude.yaml +0 -1
  56. package/dist/2.0.8/codex/plugin/skills/add-e2e-tests/references/error-recovery.md +0 -43
  57. package/dist/2.0.8/codex/plugin/skills/add-e2e-tests/references/tracker-template.md +0 -53
  58. package/dist/2.0.8/codex/plugin/skills/fix-flaky-tests/SKILL.md +0 -160
  59. package/dist/2.0.8/codex/plugin/skills/generate-test-cases/SKILL.md +0 -184
  60. package/dist/2.0.8/codex/plugin/skills/plan-test-coverage/SKILL.md +0 -116
  61. package/dist/2.0.8/codex/plugin/skills/review-test-cases/SKILL.md +0 -147
  62. package/dist/2.0.8/codex/plugin/skills/write-test-code/SKILL.md +0 -227
  63. package/dist/2.0.8/codex/plugin/skills/write-test-code/references/api-setup-teardown.md +0 -83
  64. package/dist/2.0.8/release.json +0 -18
  65. package/dist/2.0.9/.agents/plugins/marketplace.json +0 -14
  66. package/dist/2.0.9/claude/plugin/skills/add-e2e-tests/agents/openai.yaml +0 -10
  67. package/dist/2.0.9/claude/plugin/skills/add-e2e-tests/references/authentication.md +0 -8
  68. package/dist/2.0.9/claude/plugin/skills/add-e2e-tests/references/tracker-template.md +0 -53
  69. package/dist/2.0.9/claude/plugin/skills/analyze-test-codebase/SKILL.md +0 -142
  70. package/dist/2.0.9/claude/plugin/skills/analyze-test-codebase/agents/claude.yaml +0 -3
  71. package/dist/2.0.9/claude/plugin/skills/analyze-test-codebase/agents/openai.yaml +0 -4
  72. package/dist/2.0.9/claude/plugin/skills/fix-flaky-tests/agents/claude.yaml +0 -3
  73. package/dist/2.0.9/claude/plugin/skills/fix-flaky-tests/agents/openai.yaml +0 -10
  74. package/dist/2.0.9/claude/plugin/skills/generate-test-cases/agents/claude.yaml +0 -3
  75. package/dist/2.0.9/claude/plugin/skills/generate-test-cases/agents/openai.yaml +0 -10
  76. package/dist/2.0.9/claude/plugin/skills/generate-test-cases/references/scenario-categories.md +0 -36
  77. package/dist/2.0.9/claude/plugin/skills/plan-test-coverage/agents/claude.yaml +0 -3
  78. package/dist/2.0.9/claude/plugin/skills/plan-test-coverage/agents/openai.yaml +0 -10
  79. package/dist/2.0.9/claude/plugin/skills/review-test-cases/agents/claude.yaml +0 -3
  80. package/dist/2.0.9/claude/plugin/skills/review-test-cases/agents/openai.yaml +0 -10
  81. package/dist/2.0.9/claude/plugin/skills/review-test-code/SKILL.md +0 -189
  82. package/dist/2.0.9/claude/plugin/skills/review-test-code/agents/claude.yaml +0 -3
  83. package/dist/2.0.9/claude/plugin/skills/review-test-code/agents/openai.yaml +0 -10
  84. package/dist/2.0.9/claude/plugin/skills/write-test-code/agents/claude.yaml +0 -3
  85. package/dist/2.0.9/claude/plugin/skills/write-test-code/agents/openai.yaml +0 -10
  86. package/dist/2.0.9/claude/plugin/skills/write-test-code/references/anti-patterns.md +0 -88
  87. package/dist/2.0.9/claude/plugin/skills/write-test-code/references/auth-patterns.md +0 -63
  88. package/dist/2.0.9/claude/plugin/skills/write-test-code/references/mapping-tables.md +0 -56
  89. package/dist/2.0.9/claude/plugin/skills/write-test-code/references/network-interception.md +0 -56
  90. package/dist/2.0.9/codex/plugin/skills/add-e2e-tests/agents/openai.yaml +0 -10
  91. package/dist/2.0.9/codex/plugin/skills/add-e2e-tests/references/authentication.md +0 -8
  92. package/dist/2.0.9/codex/plugin/skills/add-e2e-tests/references/error-recovery.md +0 -43
  93. package/dist/2.0.9/codex/plugin/skills/add-e2e-tests/references/scaffolding.md +0 -12
  94. package/dist/2.0.9/codex/plugin/skills/add-e2e-tests/references/tracker-template.md +0 -53
  95. package/dist/2.0.9/codex/plugin/skills/analyze-test-codebase/SKILL.md +0 -142
  96. package/dist/2.0.9/codex/plugin/skills/analyze-test-codebase/agents/claude.yaml +0 -3
  97. package/dist/2.0.9/codex/plugin/skills/analyze-test-codebase/agents/openai.yaml +0 -4
  98. package/dist/2.0.9/codex/plugin/skills/fix-flaky-tests/SKILL.md +0 -160
  99. package/dist/2.0.9/codex/plugin/skills/fix-flaky-tests/agents/claude.yaml +0 -3
  100. package/dist/2.0.9/codex/plugin/skills/fix-flaky-tests/agents/openai.yaml +0 -10
  101. package/dist/2.0.9/codex/plugin/skills/fix-flaky-tests/references/fix-patterns.md +0 -91
  102. package/dist/2.0.9/codex/plugin/skills/generate-test-cases/agents/claude.yaml +0 -3
  103. package/dist/2.0.9/codex/plugin/skills/generate-test-cases/agents/openai.yaml +0 -10
  104. package/dist/2.0.9/codex/plugin/skills/generate-test-cases/references/scenario-categories.md +0 -36
  105. package/dist/2.0.9/codex/plugin/skills/plan-test-coverage/agents/claude.yaml +0 -3
  106. package/dist/2.0.9/codex/plugin/skills/plan-test-coverage/agents/openai.yaml +0 -10
  107. package/dist/2.0.9/codex/plugin/skills/review-test-cases/SKILL.md +0 -147
  108. package/dist/2.0.9/codex/plugin/skills/review-test-cases/agents/claude.yaml +0 -3
  109. package/dist/2.0.9/codex/plugin/skills/review-test-cases/agents/openai.yaml +0 -10
  110. package/dist/2.0.9/codex/plugin/skills/review-test-code/SKILL.md +0 -189
  111. package/dist/2.0.9/codex/plugin/skills/review-test-code/agents/claude.yaml +0 -3
  112. package/dist/2.0.9/codex/plugin/skills/review-test-code/agents/openai.yaml +0 -10
  113. package/dist/2.0.9/codex/plugin/skills/write-test-code/SKILL.md +0 -227
  114. package/dist/2.0.9/codex/plugin/skills/write-test-code/agents/claude.yaml +0 -3
  115. package/dist/2.0.9/codex/plugin/skills/write-test-code/agents/openai.yaml +0 -10
  116. package/dist/2.0.9/codex/plugin/skills/write-test-code/references/anti-patterns.md +0 -88
  117. package/dist/2.0.9/codex/plugin/skills/write-test-code/references/api-setup-teardown.md +0 -83
  118. package/dist/2.0.9/codex/plugin/skills/write-test-code/references/auth-patterns.md +0 -63
  119. package/dist/2.0.9/codex/plugin/skills/write-test-code/references/mapping-tables.md +0 -56
  120. package/dist/2.0.9/codex/plugin/skills/write-test-code/references/network-interception.md +0 -56
  121. package/skills/add-e2e-tests/references/tracker-template.md +0 -53
  122. /package/dist/{2.0.9 → 2.0.10}/claude/plugin/skills/add-e2e-tests/agents/claude.yaml +0 -0
  123. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/add-e2e-tests/references/authentication.md +0 -0
  124. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/analyze-test-codebase/SKILL.md +0 -0
  125. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/analyze-test-codebase/agents/claude.yaml +0 -0
  126. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/analyze-test-codebase/agents/openai.yaml +0 -0
  127. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/fix-flaky-tests/agents/claude.yaml +0 -0
  128. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/fix-flaky-tests/agents/openai.yaml +0 -0
  129. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/generate-test-cases/agents/claude.yaml +0 -0
  130. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/generate-test-cases/agents/openai.yaml +0 -0
  131. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/generate-test-cases/references/scenario-categories.md +0 -0
  132. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/plan-test-coverage/agents/claude.yaml +0 -0
  133. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/plan-test-coverage/agents/openai.yaml +0 -0
  134. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/review-test-cases/agents/claude.yaml +0 -0
  135. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/review-test-cases/agents/openai.yaml +0 -0
  136. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/review-test-code/SKILL.md +0 -0
  137. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/review-test-code/agents/claude.yaml +0 -0
  138. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/review-test-code/agents/openai.yaml +0 -0
  139. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/write-test-code/agents/claude.yaml +0 -0
  140. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/write-test-code/agents/openai.yaml +0 -0
  141. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/write-test-code/references/anti-patterns.md +0 -0
  142. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/write-test-code/references/auth-patterns.md +0 -0
  143. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/write-test-code/references/mapping-tables.md +0 -0
  144. /package/dist/{2.0.8 → 2.0.10}/claude/plugin/skills/write-test-code/references/network-interception.md +0 -0
  145. /package/dist/{2.0.9 → 2.0.10}/codex/plugin/skills/add-e2e-tests/agents/claude.yaml +0 -0
  146. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/add-e2e-tests/references/authentication.md +0 -0
  147. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/analyze-test-codebase/SKILL.md +0 -0
  148. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/analyze-test-codebase/agents/claude.yaml +0 -0
  149. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/analyze-test-codebase/agents/openai.yaml +0 -0
  150. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/fix-flaky-tests/agents/claude.yaml +0 -0
  151. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/fix-flaky-tests/agents/openai.yaml +0 -0
  152. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/generate-test-cases/agents/claude.yaml +0 -0
  153. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/generate-test-cases/agents/openai.yaml +0 -0
  154. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/generate-test-cases/references/scenario-categories.md +0 -0
  155. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/plan-test-coverage/agents/claude.yaml +0 -0
  156. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/plan-test-coverage/agents/openai.yaml +0 -0
  157. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/review-test-cases/agents/claude.yaml +0 -0
  158. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/review-test-cases/agents/openai.yaml +0 -0
  159. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/review-test-code/SKILL.md +0 -0
  160. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/review-test-code/agents/claude.yaml +0 -0
  161. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/review-test-code/agents/openai.yaml +0 -0
  162. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/write-test-code/agents/claude.yaml +0 -0
  163. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/write-test-code/agents/openai.yaml +0 -0
  164. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/write-test-code/references/anti-patterns.md +0 -0
  165. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/write-test-code/references/auth-patterns.md +0 -0
  166. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/write-test-code/references/mapping-tables.md +0 -0
  167. /package/dist/{2.0.8 → 2.0.10}/codex/plugin/skills/write-test-code/references/network-interception.md +0 -0
@@ -1,116 +0,0 @@
1
- ---
2
- name: plan-test-coverage
3
- description: >
4
- Use before writing specs or test code to decide what E2E coverage is needed first. It scans existing tests, inspects the target flow, finds coverage gaps, and produces a prioritized P0/P1/P2 plan with TC-IDs. Use it for requests like "what tests do I need", "coverage gaps", or "what TC-IDs are missing". It does not write detailed specs or executable tests.
5
- allowed-tools: Read Glob Grep Task mcp__browser__ping mcp__browser__navigate mcp__browser__find mcp__browser__get_element mcp__browser__get_form mcp__browser__get_field mcp__browser__click mcp__browser__type mcp__browser__press mcp__browser__select mcp__browser__hover mcp__browser__drag mcp__browser__scroll mcp__browser__scroll_to mcp__browser__wheel mcp__browser__snapshot mcp__browser__screenshot mcp__browser__go_back mcp__browser__go_forward mcp__browser__reload mcp__browser__list_pages mcp__browser__close_page
6
- ---
7
-
8
- # Plan Test Coverage
9
-
10
- Plan what E2E tests to write for a feature by analyzing existing test coverage and doing a quick site inspection.
11
-
12
- ## Workflow
13
-
14
- 1. **Parse input** — extract the target URL and feature area from: $ARGUMENTS
15
-
16
- 2. **Check existing test coverage**:
17
- - Search for existing test files related to the feature:
18
- ```
19
- Grep for feature keywords in **/*.spec.ts, **/*.test.ts
20
- ```
21
- - Identify what's already covered and what's missing
22
- - Note existing TC-IDs for the feature area to avoid conflicts
23
-
24
- 3. **Quick site inspection** (lightweight, not full exploration):
25
- - Follow the `agent-web-interface-guide` skill's browsing patterns (orient before acting, use `list_pages` for session awareness, close only pages you opened)
26
- - Navigate to the URL in a dedicated page
27
- - Use `find` to catalog the main interactive elements
28
- - Use `get_form` or `get_field` if the page has forms worth covering
29
- - Identify the key user flows visible on the page
30
- - Close only the page you opened when done; do not rely on a session-wide close
31
-
32
- 4. **Identify test categories** — for the feature, determine tests needed across:
33
- - **Critical path** — core happy path that must never break
34
- - **Input validation** — form fields, required fields, format constraints
35
- - **Error states** — network errors, server errors, empty states
36
- - **Edge cases** — boundary values, special characters, concurrent actions
37
- - **Cross-feature** — interactions with other features (e.g., auth + checkout)
38
- - **Accessibility** — keyboard navigation, screen reader support, focus management
39
- - **Visual regression** — layout consistency, responsive breakpoints (375px, 768px, 1280px)
40
- - **Performance** — loading states, lazy loading, large data sets
41
- - **Network errors** — server 500s, timeouts, offline behavior
42
-
43
- Not all categories apply to every project. Include Accessibility, Visual Regression, and Cross-Browser sections only when the project has explicit requirements, tooling, or configuration for them. Omit them from the output plan if not relevant — a focused plan is more useful than a padded one.
44
-
45
- 5. **Prioritize** — rank tests by:
46
- - **P0 (Must have)**: Core user journey, auth flows, data corruption prevention. Blocks revenue/signups if broken.
47
- - **P1 (Should have)**: Input validation, common error paths, accessibility basics (keyboard navigation, form labels)
48
- - **P2 (Nice to have)**: Edge cases, visual regression, performance scenarios, cross-browser specifics, rare error paths
49
-
50
- 6. **Output test plan**:
51
-
52
- ```markdown
53
- ## Test Coverage Plan: <Feature>
54
-
55
- **URL:** <url>
56
- **Date:** <date>
57
- **Existing coverage:** <N tests already exist / none>
58
-
59
- ### Already Covered
60
- - TC-FEATURE-001: <description> (in `tests/feature.spec.ts`)
61
- - ...
62
-
63
- ### Proposed New Tests
64
-
65
- #### P0 — Critical Path
66
- | TC-ID | Description | Why Critical |
67
- |-------|-------------|-------------|
68
- | TC-FEATURE-010 | Happy path: user completes full flow | Core revenue path |
69
-
70
- #### P1 — Validation & Errors
71
- | TC-ID | Description | Why Important |
72
- |-------|-------------|--------------|
73
- | TC-FEATURE-020 | Submit with empty required fields | Common user error |
74
-
75
- #### P2 — Edge Cases
76
- | TC-ID | Description | Notes |
77
- |-------|-------------|-------|
78
- | TC-FEATURE-030 | Special characters in search input | Unicode handling |
79
-
80
- #### Accessibility (include if project has accessibility requirements or WCAG compliance goals)
81
- | TC-ID | Description | WCAG Criterion |
82
- |-------|-------------|----------------|
83
- | TC-FEATURE-A01 | Keyboard-only navigation through flow | 2.1.1 Keyboard |
84
- | TC-FEATURE-A02 | Form errors announced to screen readers | 1.3.1 Info and Relationships |
85
-
86
- #### Visual Regression (if project has visual testing setup)
87
- | TC-ID | Description | Viewport |
88
- |-------|-------------|----------|
89
- | TC-FEATURE-V01 | Layout consistency at mobile width | 375x812 |
90
-
91
- #### Cross-Browser Matrix (include if project runs tests across multiple browsers)
92
- | Browser | Priority | Reason |
93
- |---------|----------|--------|
94
- | Chromium | P0 | Primary target |
95
- | Firefox | P1 | Second largest desktop share |
96
- | WebKit/Safari | P1 | Required for iOS users |
97
-
98
- ### Recommended Order
99
- 1. Write P0 tests first (N tests)
100
- 2. Then P1 validation + accessibility basics (N tests)
101
- 3. P2 edge cases, visual regression, and performance as time allows
102
-
103
- ### Next Steps
104
- - Invoke the `generate-test-cases` skill with the target URL and journey for detailed test specs
105
- - Invoke the `write-test-code` skill to implement the tests
106
- ```
107
-
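For readers of this hunk: workflow step 2 asks the agent to note existing TC-IDs so that new ones do not collide. A minimal sketch of that bookkeeping follows. It is illustrative only and not code shipped in the package: `collectTcNumbers`, `nextTcId`, the regular expression, and the three-digit zero padding are assumptions inferred from IDs such as `TC-FEATURE-010` in the plan template, and the feature name is assumed to be plain alphanumeric text.

```typescript
// Hypothetical helpers, not part of the plugin. Assumes IDs look like
// TC-<FEATURE>-<digits>, as in the plan template (e.g. TC-FEATURE-010).
function collectTcNumbers(source: string, feature: string): number[] {
  const pattern = new RegExp("TC-" + feature.toUpperCase() + "-(\\d+)", "g");
  const found: number[] = [];
  let match: RegExpExecArray | null = pattern.exec(source);
  while (match !== null) {
    found.push(parseInt(match[1], 10));
    match = pattern.exec(source);
  }
  return found;
}

// Propose the next unused ID: one above the highest number seen so far.
function nextTcId(sources: string[], feature: string): string {
  let highest = 0;
  for (const source of sources) {
    for (const n of collectTcNumbers(source, feature)) {
      if (n > highest) highest = n;
    }
  }
  let digits = String(highest + 1);
  while (digits.length < 3) digits = "0" + digits;
  return "TC-" + feature.toUpperCase() + "-" + digits;
}
```

Lettered IDs such as `TC-FEATURE-A01` are deliberately not matched by this pattern, so accessibility and visual-regression series would need their own counters under the same assumptions.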
108
- ## Example Usage
109
-
110
- ```
111
- Claude Code: /plan-test-coverage https://myapp.com/checkout Checkout flow
112
- Codex: $plan-test-coverage https://myapp.com/checkout Checkout flow
113
-
114
- Claude Code: /plan-test-coverage https://myapp.com/login Authentication
115
- Codex: $plan-test-coverage https://myapp.com/login Authentication
116
- ```
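The example invocations above all follow a `<url> <feature area>` shape. In the skill itself the agent interprets `$ARGUMENTS` in natural language; the sketch below only illustrates the split implied by step 1 ("Parse input"). `PlanRequest` and `parsePlanArguments` are hypothetical names, and the assumption that the URL always comes first and starts with `http://` or `https://` is taken from the examples, not from any documented contract.

```typescript
interface PlanRequest {
  url: string;
  feature: string;
}

// Hypothetical parser: assumes the first whitespace-separated token is the
// URL and everything after it is the feature area, as in the examples.
function parsePlanArguments(args: string): PlanRequest {
  const trimmed = args.trim();
  const match = /^(https?:\/\/\S+)\s+(.+)$/.exec(trimmed);
  if (match === null) {
    throw new Error("Expected '<url> <feature>', got: " + args);
  }
  return { url: match[1], feature: match[2].trim() };
}
```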
@@ -1,15 +0,0 @@
1
- {
2
- "name": "e2e-test-builder",
3
- "version": "2.0.8",
4
- "description": "Full-pipeline Playwright E2E test generation — explores your live site via browser, detects existing test conventions, plans coverage gaps, produces reviewed test specs, writes production-grade test code with quality gates, and stabilizes flaky tests",
5
- "author": {
6
- "name": "Athenaflow"
7
- },
8
- "skills": "./skills/",
9
- "interface": {
10
- "displayName": "E2E Test Builder",
11
- "shortDescription": "Full-pipeline Playwright E2E test generation with browser exploration and quality gates",
12
- "developerName": "Athenaflow",
13
- "category": "Testing"
14
- }
15
- }
@@ -1,9 +0,0 @@
1
- {
2
- "name": "@athenaflow/plugin-e2e-test-builder",
3
- "version": "2.0.8",
4
- "description": "Full-pipeline Playwright E2E test generation — explores your live site via browser, detects existing test conventions, plans coverage gaps, produces reviewed test specs, writes production-grade test code with quality gates, and stabilizes flaky tests",
5
- "license": "MIT",
6
- "publishConfig": {
7
- "access": "public"
8
- }
9
- }
@@ -1,217 +0,0 @@
1
- ---
2
- name: add-e2e-tests
3
- description: >
4
- THE DEFAULT ENTRY POINT for all Playwright / E2E test work. This skill should be used FIRST
5
- whenever the user wants to add, create, or set up end-to-end tests for any feature, page, or
6
- application. Runs the full pipeline: analyze codebase, explore the live site, plan coverage,
7
- generate TC-ID specs, run quality-gate reviews, write production-grade test code, and execute.
8
- Delegates to sub-skills (analyze-test-codebase, plan-test-coverage, generate-test-cases,
9
- review-test-cases, write-test-code, review-test-code, fix-flaky-tests) internally — do NOT
10
- skip to sub-skills directly unless the user explicitly requests a narrow activity.
11
- Iterative and resumable via tracker file. Uses subagent delegation to save context.
12
- user-invocable: true
13
- argument-hint: "<url> <feature to test>"
14
- allowed-tools: Read Write Edit Glob Grep Bash Task
15
- ---
16
-
17
- # Add E2E Tests
18
-
19
- Go from zero to passing Playwright tests for the target feature in one interactive session.
20
-
21
- ## Input
22
-
23
- Parse the target URL and feature description from: $ARGUMENTS
24
-
25
- Derive a **feature slug** from the feature description (e.g., "Login flow" → `login`, "Checkout with payment" → `checkout`). Use this slug for file naming throughout.
26
-
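The removed text gives two slug examples ("Login flow" to `login`, "Checkout with payment" to `checkout`) but no rule, so the derivation is left to the agent's judgment. One heuristic consistent with both examples is sketched below. `featureSlug` and `slugPaths` are hypothetical helpers, and the file-name patterns are assumptions modeled on `login.spec.ts` and `test-cases/<feature>.md` as they appear later in this skill.

```typescript
// Hypothetical heuristic, not from the package: use the first word of the
// feature description, lowercased, with non-alphanumeric characters removed.
function featureSlug(description: string): string {
  const words = description.trim().split(/\s+/);
  const first = words[0].toLowerCase().replace(/[^a-z0-9]/g, "");
  if (first.length === 0) {
    throw new Error("Cannot derive a slug from: " + JSON.stringify(description));
  }
  return first;
}

// File names that reuse the slug (assumed naming, see the note above).
function slugPaths(slug: string): { spec: string; testCases: string } {
  return { spec: slug + ".spec.ts", testCases: "test-cases/" + slug + ".md" };
}
```

A first-word rule would be too crude for descriptions such as "User profile settings"; a real session would likely pick the slug by meaning rather than position.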
27
- ## Session Protocol
28
-
29
- ### 1. Orient: Understand the Project, the Product, and Your Capabilities
30
-
31
- Before planning any work, build deep situational awareness. This step determines the quality of everything that follows — rushed orientation leads to missed test cases and wasted effort.
32
-
33
- **Check for existing progress:**
34
- - If `e2e-tracker.md` exists in the project root, read it and resume from where you left off — skip to **step 2 (Plan)** with the remaining work.
35
- - If no tracker exists, this is a fresh start. Proceed with orientation below.
36
-
37
- #### First: create initial tasks and tracker
38
-
39
- As soon as you parse the user's request:
40
-
41
- 1. **Create the tracker** — write `e2e-tracker.md` with the goal (URL, feature, slug) and a skeleton plan.
42
- 2. **Create high-level tasks** for the work ahead — analyze codebase, explore the product, plan coverage, generate test specs, write tests, verify tests.
43
-
44
- These are your starting skeleton. As you work through orientation and discover the actual shape of the work, refine both the tasks and the tracker — break tasks into granular sub-tasks, add new ones, remove ones that don't apply.
45
-
46
- Treat the task list as a visible milestone log. Keep it concise, but update it continuously. Do not leave broad tasks open until the end and then mark everything complete in one batch.
47
-
48
- #### 1a. Understand the codebase
49
-
50
- - Does a Playwright config exist (`playwright.config.{ts,js,mjs}`)? If not, you will need to scaffold one (see Scaffolding section).
51
- - Are there existing tests? What conventions do they follow — naming, locators, fixtures, page objects, auth?
52
- - Load the `analyze-test-codebase` skill and follow its methodology.
53
-
54
- #### 1b. Understand the product
55
-
56
- This is the most important part of orientation. You cannot write good tests for a product you don't understand.
57
-
58
- - **Read existing test cases** — if `test-cases/*.md` files exist, read them to understand what journeys have been mapped. Look at what's covered AND what's missing.
59
- - **Browse the actual product** — load the `agent-web-interface-guide` skill and use the browser MCP tools to walk through the feature you're testing. Don't just skim the page — interact with it as a user would: fill forms, click buttons, trigger validation, navigate between pages, check error states.
60
- - **Map the user journey in detail** — understand the complete flow: entry points, happy paths, error paths, edge cases, what happens with invalid input, what happens when the user goes back, what conditional UI exists.
61
-
62
- Why this matters: absent explicit exploration, agents tend to write tests based on assumptions about how a product works rather than how it actually works. The result is tests that target imaginary behavior or miss critical real behavior. Spending time here prevents both.
63
-
64
- #### 1c. Know your skills
65
-
66
- You have access to specialized skills that contain deep domain knowledge. Load the relevant skill before performing each activity — skills prevent improvisation and encode best practices.
67
-
68
- | Activity | Skill |
69
- |----------|-------|
70
- | Analyzing test setup, config, conventions | `analyze-test-codebase` |
71
- | Deciding what to test, coverage gaps, priorities | `plan-test-coverage` |
72
- | Opening a URL, browsing, using browser MCP tools | `agent-web-interface-guide` |
73
- | Creating TC-ID specs from site exploration | `generate-test-cases` |
74
- | Reviewing TC-ID specs before implementation | `review-test-cases` |
75
- | Writing, editing, or refactoring test code | `write-test-code` |
76
- | Reviewing test code before execution signoff | `review-test-code` |
77
- | Debugging test failures, checking stability | `fix-flaky-tests` |
78
-
79
- Before doing a substantial activity, load the skill that covers that activity so you can follow its workflow rather than improvising.
80
-
81
- #### 1d. Update the tracker with orientation findings
82
-
83
- After orienting, update the tracker with what you learned about the codebase and product, conventions discovered, and your refined plan. The tracker must always answer these four questions for anyone reading it cold:
84
-
85
- 1. What is the goal?
86
- 2. What has been done?
87
- 3. What is remaining?
88
- 4. What should I do next?
89
-
90
- See [references/tracker-template.md](references/tracker-template.md) for a concrete template.
91
-
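The four questions above can be read as a small data contract for `e2e-tracker.md`. The sketch below shows one way to keep them answerable at all times. It is not the deleted `tracker-template.md` (whose content is not reproduced in this section): the `Tracker` fields, the heading names, and the rule that "next" is simply the first remaining item are all assumptions made for illustration.

```typescript
// Hypothetical shape: the real layout lived in references/tracker-template.md.
// Field and heading names here are illustrative only.
interface Tracker {
  goal: string;        // 1. What is the goal?
  done: string[];      // 2. What has been done?
  remaining: string[]; // 3. What is remaining?
}

function renderTracker(t: Tracker): string {
  // 4. What should I do next? Assumed to be the first remaining item.
  const next = t.remaining.length > 0 ? t.remaining[0] : "Nothing left; all tasks complete.";
  const bullets = (items: string[]): string =>
    items.length > 0 ? items.map((i) => "- " + i).join("\n") : "- (none)";
  return [
    "## Goal", t.goal,
    "## Done", bullets(t.done),
    "## Remaining", bullets(t.remaining),
    "## Next", next,
  ].join("\n");
}
```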
92
- ### 2. Plan: Refine Tasks Into Granular Checkpoints
93
-
94
- By now you have initial tasks and a tracker from step 1. Refine tasks into granular checkpoints. The plan should flow from what you learned during orientation, not from a fixed template.
95
-
96
- #### Task granularity
97
-
98
- Think in small checkpoints, not big phases. Each task should represent a concrete, verifiable unit of progress.
99
-
100
- Too coarse: "Analyze codebase", "Write tests", "Verify tests"
101
-
102
- Right granularity:
103
- - "Read playwright.config.ts — extract baseURL, testDir, projects"
104
- - "Read 2 existing test files — identify locator strategy and naming pattern"
105
- - "Write conventions report to e2e-plan/conventions.md"
106
- - "Navigate to /login — catalog all form fields, buttons, and validation messages"
107
- - "Submit login form empty — record all validation error messages and their positions"
108
- - "Submit login with invalid email format — record inline validation behavior"
109
- - "Write TC-LOGIN-001: happy path login with valid credentials"
110
- - "Write TC-LOGIN-002: login with empty email shows required field error"
111
- - "Run login.spec.ts and record full output"
112
- - "Fix TC-LOGIN-003: selector not found — browse page and re-extract selector"
113
- - "Re-run login.spec.ts — verify fix didn't break other tests"
114
- - "Check all TC-IDs from spec are present in test files"
115
-
116
- **Never be conservative.** More tasks is better than fewer. If you discover new work mid-session (a test fails, a selector changed, a form has unexpected validation), add tasks dynamically. The task list is a living document that reflects the real state of the work.
117
-
118
- Create tasks for verification steps too (running tests, checking coverage, browsing to confirm selectors), not just implementation.
119
-
120
- Update task status as each checkpoint completes. A good pattern is: finish exploration and mark it complete, finish coverage/spec work and mark it complete, finish implementation and mark it complete, then finish review/execution and mark it complete. Do not keep all milestones open until session end.
121
-
122
- ### 3. Execute
123
-
124
- Work through your tasks. Load the relevant skill before each activity.
125
-
126
- #### Planning uses the browser heavily
127
-
128
- When planning what to test (coverage planning, test case generation), use the browser extensively. Don't just catalog elements — interact with the product to discover:
129
- - What validation messages appear for each field?
130
- - What happens when you submit with missing data?
131
- - What error states exist (network errors, empty states, permission errors)?
132
- - What does the flow look like end-to-end, not just page-by-page?
133
- - What edge cases exist (special characters, long inputs, rapid clicks)?
134
- - What UI changes conditionally (loading states, disabled buttons, progressive disclosure)?
135
-
136
- Every test case you generate should trace back to something you actually observed or deliberately triggered in the browser. This is how you avoid introducing useless test cases (testing imaginary behavior) and avoid missing important ones (behavior you didn't think to check).
137
-
138
- #### Subagent delegation
139
-
140
- Delegate heavy browser exploration and test writing to subagents when that saves context for orchestration, verification, and debugging. When delegating:
141
- - Pass the relevant file paths (conventions, coverage plan, test specs)
142
- - Instruct the subagent to invoke the appropriate skill (subagents inherit access to plugin skills)
143
- - Specify concrete output expectations (file path, format, TC-ID conventions)
144
-
145
- #### Quality gates
146
-
147
- Two review gates and a test execution checkpoint are mandatory during execution. The review gates are review-only — they produce findings but do not modify files.
148
-
149
- **Gate 1: Review test case specs** (after `generate-test-cases`, before `write-test-code`)
150
- 1. Load the `review-test-cases` skill and run it against `test-cases/<feature>.md`
151
- 2. If verdict is **NEEDS REVISION** — address all blockers in the spec before proceeding to implementation
152
- 3. If verdict is **PASS WITH WARNINGS** — address warnings if quick, otherwise note them and proceed
153
- 4. Record the review verdict in the tracker
154
-
155
- **Gate 2: Review test code** (after `write-test-code`, before final test execution)
156
- 1. Load the `review-test-code` skill and run it against the implemented test files
157
- 2. If verdict is **NEEDS REVISION** — fix all blockers before running tests for signoff
158
- 3. If verdict is **PASS WITH WARNINGS** — fix warnings that affect stability, proceed with execution
159
- 4. Record the review verdict in the tracker
160
-
161
- **Checkpoint: Test execution**
162
- 1. Run the tests: `npx playwright test <file> --reporter=list 2>&1`
163
- 2. Record full output — green test output is the only proof of correctness
164
- 3. If tests fail, load the `fix-flaky-tests` skill and follow its structured diagnostic approach. Do not guess-and-retry.
165
- 4. Maximum 3 fix-and-rerun cycles per test. If stuck after 3 cycles, record the diagnostic output in the tracker and move on.
166
-
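The cycle limit in the checkpoint above can be pictured as a bounded loop. This is a minimal sketch, not part of the plugin: `runWithCycleLimit`, `RunResult`, and `CycleReport` are hypothetical names, and the two callbacks stand in for running `npx playwright test` and applying a diagnosed fix.

```typescript
// Hypothetical helper types; none of these names come from the plugin.
type RunResult = { passed: boolean; output: string };

interface CycleReport {
  passed: boolean;
  cyclesUsed: number; // fix-and-rerun cycles consumed (0 if the first run was green)
  outputs: string[];  // output of every run, kept so it can be recorded in the tracker
}

function runWithCycleLimit(
  runTest: () => RunResult,
  applyFix: (lastOutput: string) => void,
  maxCycles = 3,
): CycleReport {
  const outputs: string[] = [];
  let result = runTest();
  outputs.push(result.output);

  let cyclesUsed = 0;
  // Stop as soon as the test is green or the cycle budget is spent.
  while (!result.passed && cyclesUsed < maxCycles) {
    applyFix(result.output);
    result = runTest();
    outputs.push(result.output);
    cyclesUsed++;
  }
  return { passed: result.passed, cyclesUsed, outputs };
}
```

A report with `passed: false` and `cyclesUsed: 3` corresponds to the "record the diagnostic output in the tracker and move on" case.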
167
- **Test execution and coverage checks must never be delegated to subagents.** Run `npx playwright test` directly and record the output.
168
-
169
- #### Update the tracker as you work
170
-
171
- Do not wait until session end. After each meaningful chunk of progress (completing a step, discovering a blocker, producing an artifact), update the tracker. If your context window resets, only what's in the tracker survives.
172
-
173
- Keep the tracker and task list synchronized. If you record progress in the tracker, update the corresponding task status in the same phase of work.
174
-
175
- #### Error recovery
176
-
177
- If infrastructure failures occur (browser MCP unavailable, clone failures, npm install errors), see [references/error-recovery.md](references/error-recovery.md) for diagnostic steps. General pattern: diagnose, attempt one known fix, and if still stuck, record the failure in the tracker and ask the user.
178
-
179
- ### 4. End of Session
180
-
181
- Before exiting:
182
- 1. Ensure the tracker reflects all progress, discoveries, and blockers from this session
183
- 2. Write clear instructions for what the next session should do
184
- 3. If all work is complete and all tests pass with full TC-ID coverage: write `<!-- E2E_COMPLETE -->` as the last line of the tracker
185
- 4. If an unrecoverable blocker prevents progress: write `<!-- E2E_BLOCKED: reason -->` as the last line
186
-
187
- Do not write terminal markers prematurely. Only after you are confident the work is truly done or truly stuck.
188
-
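As an illustration of how a later session (or an outer loop) might read these terminal markers, here is a small sketch. The function name `readTrackerStatus` and the `TrackerStatus` type are assumptions for this example; only the two marker strings come from the text above.

```typescript
// Hypothetical status type; only the marker strings are taken from the skill text.
type TrackerStatus =
  | { kind: "complete" }
  | { kind: "blocked"; reason: string }
  | { kind: "in-progress" };

function readTrackerStatus(trackerText: string): TrackerStatus {
  // The markers only count when they are the last non-empty line of the tracker.
  const lines = trackerText
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const last = lines[lines.length - 1] ?? "";

  if (last === "<!-- E2E_COMPLETE -->") {
    return { kind: "complete" };
  }
  const blocked = /^<!-- E2E_BLOCKED:\s*(.*?)\s*-->$/.exec(last);
  if (blocked) {
    return { kind: "blocked", reason: blocked[1] ?? "" };
  }
  return { kind: "in-progress" };
}
```

A marker that appears anywhere other than the final non-empty line is treated as in-progress, which matches the "last line of the tracker" rule.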
189
- ## Scaffolding
190
-
191
- If Playwright is not set up in the target project, follow the procedure in [references/scaffolding.md](references/scaffolding.md) to clone the boilerplate, merge configuration, and install dependencies. Log all scaffolding steps in the tracker.
192
-
193
- ## Authentication
194
-
195
- If the target feature requires login, follow [references/authentication.md](references/authentication.md). Key rule: never hardcode credentials — use environment variables or `storageState`.
196
-
197
- ## Principles
198
-
199
- - **Skills carry the knowledge** — load the relevant skill before each activity; do not improvise
200
- - **Subagent-driven** — delegate heavy browser and writing work to subagents to save context
201
- - **Follow existing conventions** — match the project's test style, not a generic template
202
- - **Traceable** — every test links back to a TC-ID from the spec
203
- - **Use what the project provides** — if the scaffolded boilerplate includes Page Object Models (BasePage, pages/), path aliases (tsconfig paths), or utility modules, USE them in generated tests. Do not ship infrastructure that tests ignore. If a boilerplate file is unused after test generation, either integrate it or remove it — dead code in test infrastructure causes confusion.
204
- - **No arbitrary waits** — use Playwright's built-in auto-wait and event-driven waits
205
- - **API before UI for setup** — use API calls (`request` fixture) for test data; reserve UI for what you are verifying
206
- - **Test failures, not just success** — every feature needs error path coverage
207
- - **Artifacts live in standard locations** — `e2e-plan/` for analysis, `test-cases/` for specs, project test dir for test files
208
-
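The "Traceable" principle can be checked mechanically. The sketch below assumes TC-IDs follow the `TC-<FEATURE>-<3 digits>` shape used in the tracker example (for instance `TC-CHECKOUT-001`); the function names are hypothetical, and a project with a different ID convention would need a different pattern.

```typescript
// Assumed ID shape: TC-<UPPERCASE FEATURE>-<3 digits>, e.g. TC-CHECKOUT-001.
const TC_ID_PATTERN = /TC-[A-Z0-9]+-\d{3}/g;

function extractTcIds(text: string): Set<string> {
  return new Set(text.match(TC_ID_PATTERN) ?? []);
}

// Returns the TC-IDs that appear in the spec but nowhere in the test code.
function findUncoveredTcIds(specText: string, testCode: string): string[] {
  const covered = extractTcIds(testCode);
  return Array.from(extractTcIds(specText))
    .filter((id) => !covered.has(id))
    .sort();
}
```

An empty result means every spec ID is mentioned in the test code. It does not prove the tests are meaningful, only that the link exists.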
209
- ## Example Usage
210
-
211
- ```
212
- Claude Code: /add-e2e-tests https://myapp.com/checkout Checkout flow with cart, shipping, and payment
213
- Codex: $add-e2e-tests https://myapp.com/checkout Checkout flow with cart, shipping, and payment
214
-
215
- Claude Code: /add-e2e-tests https://myapp.com/login User authentication including social login
216
- Codex: $add-e2e-tests https://myapp.com/login User authentication including social login
217
- ```
@@ -1,43 +0,0 @@
1
- # Error Recovery for Infrastructure Failures
2
-
3
- When infrastructure failures occur during E2E test building, follow the general pattern: diagnose, attempt one known fix, and if still stuck, record the failure in the tracker and ask the user.
4
-
5
- ## Browser MCP unavailable
6
-
7
- The browser MCP server (`agent-web-interface`) must be running for site exploration.
8
-
9
- 1. Verify the MCP server is configured in the project (check `.mcp.json` or plugin config).
10
- 2. Ask the user to confirm the MCP server is running or to restart it.
11
- 3. If unreachable after user intervention, mark the session as blocked: `<!-- E2E_BLOCKED: browser MCP server unreachable -->`.
12
-
13
- ## Boilerplate clone fails
14
-
15
- When `git clone git@github.com:lespaceman/playwright-typescript-e2e-boilerplate.git` fails:
16
-
17
- 1. Check if SSH keys are configured — the error will usually indicate `Permission denied (publickey)`.
18
- 2. Fall back to HTTPS: `git clone https://github.com/lespaceman/playwright-typescript-e2e-boilerplate.git`.
19
- 3. If both fail, ask the user to verify network access and GitHub authentication.
20
-
21
- ## npm install fails
22
-
23
- 1. Check the Node.js version — Playwright requires Node 18+. Run `node --version`.
24
- 2. Clear the npm cache: `npm cache clean --force`.
25
- 3. Check for lockfile conflicts — if both `package-lock.json` and `yarn.lock` exist, ask the user which package manager to use.
26
- 4. If dependency resolution fails, try `npm install --legacy-peer-deps` as a fallback.
27
-
28
- ## Playwright browser install fails
29
-
30
- When `npx playwright install --with-deps chromium` fails:
31
-
32
- 1. Try installing just chromium without system deps: `npx playwright install chromium`.
33
- 2. Check permissions — on Linux, system dependency installation may require sudo.
34
- 3. If behind a corporate proxy, ask the user for proxy configuration.
35
- 4. As a last resort, ask the user to run the install command manually and confirm when done.
36
-
37
- ## General pattern
38
-
39
- For any infrastructure failure not listed above:
40
-
41
- 1. **Diagnose** — read the error message carefully, check logs, identify the root cause.
42
- 2. **Attempt one known fix** — apply the most likely solution based on the error.
43
- 3. **If still stuck** — record the full error output and diagnostic steps taken in the tracker, then ask the user for help. Do not loop through multiple speculative fixes.
@@ -1,53 +0,0 @@
1
- # Tracker Template: e2e-tracker.md
2
-
3
- Use this as a starting template when creating the tracker file. Adapt sections as needed.
4
-
5
- ---
6
-
7
- ```markdown
8
- # E2E Test Tracker
9
-
10
- ## Goal
11
-
12
- - **URL:** https://myapp.com/checkout
13
- - **Feature:** Checkout flow with cart, shipping, and payment
14
- - **Slug:** checkout
15
-
16
- ## Progress
17
-
18
- ### Session 1 — 2026-04-01
19
-
20
- - Analyzed codebase: Playwright 1.42, TypeScript, existing auth fixture in `tests/fixtures/auth.ts`
21
- - Browsed checkout flow: 4-step wizard (cart review, shipping, payment, confirmation)
22
- - Discovered 3 form validation states per step
23
- - Generated test spec: `test-cases/checkout.md` with TC-CHECKOUT-001 through TC-CHECKOUT-012
24
- - Review gate passed with warnings (TC-CHECKOUT-008 needs clarification on coupon edge case)
25
-
26
- ### Session 2 — 2026-04-02
27
-
28
- - Wrote `tests/checkout.spec.ts` covering TC-CHECKOUT-001 through TC-CHECKOUT-012
29
- - Code review gate passed
30
- - Ran tests: 10/12 passing, 2 failing (TC-CHECKOUT-006: selector stale, TC-CHECKOUT-011: timeout)
31
- - Fixed TC-CHECKOUT-006: updated selector from `.price` to `[data-testid="cart-total"]`
32
- - Fixed TC-CHECKOUT-011: added `waitForResponse` before assertion
33
-
34
- ## Remaining
35
-
36
- - Re-run full suite to confirm fixes
37
- - Verify TC-ID coverage: all 12 specs should map to test code
38
-
39
- ## Next Steps
40
-
41
- - Run `npx playwright test tests/checkout.spec.ts --reporter=list` and record output
42
- - If all green, mark complete
43
- - If failures remain, diagnose with `fix-flaky-tests` skill
44
-
45
- <!-- E2E_COMPLETE -->
46
- ```
47
-
48
- ---
49
-
50
- **Terminal markers** (write as the last line of the tracker when appropriate):
51
-
52
- - `<!-- E2E_COMPLETE -->` — all tests pass, all TC-IDs covered, work is done
53
- - `<!-- E2E_BLOCKED: reason -->` — unrecoverable blocker prevents further progress (e.g., `<!-- E2E_BLOCKED: login requires 2FA token we cannot automate -->`)
@@ -1,160 +0,0 @@
1
- ---
2
- name: fix-flaky-tests
3
- description: >
4
- This skill should be used when a Playwright test is failing, flaky, timing out, or behaving
5
- inconsistently. It provides structured root cause analysis for: stabilizing intermittent tests,
6
- debugging timeouts ("Test timeout of 30000ms exceeded"), fixing race conditions, investigating
7
- local-vs-CI divergence, and running repeated stability checks (--repeat-each).
8
- IMPORTANT: If running tests with --repeat-each, --retries, or multiple times to check stability,
9
- STOP and load this skill first — it has structured root cause analysis that prevents brute-force
10
- approaches. Triggers: "stabilize", "intermittent", "flaky", "keeps failing", "fails in CI",
11
- "timeout on", "race condition", "run N times to check stability", "verify tests are stable".
12
- NOT for writing new tests (use write-test-code) or analyzing setup (use analyze-test-codebase).
13
- allowed-tools: Read Write Edit Bash Glob Grep Task
14
- ---
15
-
16
- # Fix Flaky Tests
17
-
18
- Systematically diagnose and fix intermittent Playwright test failures using root cause analysis. A flaky test is worse than no test — it trains teams to ignore failures.
19
-
20
- ## Input
21
-
22
- Parse the test file path or test name from: $ARGUMENTS
23
-
24
- If no argument provided, ask: "Which test file or test name is flaky?"
25
-
26
- ## Workflow
27
-
28
- ### Step 1: Reproduce and Classify
29
-
30
- 1. **Read the test file** to understand what it tests and how
31
- 2. **Run the test multiple times** to observe the failure pattern:
32
- ```bash
33
- npx playwright test <file> --repeat-each=5 --reporter=list 2>&1
34
- ```
35
- 3. **Run the full suite** if it passed above; the test may fail only when run alongside other tests:
36
- ```bash
37
- npx playwright test --reporter=list 2>&1
38
- ```
39
- 4. **Classify the failure** into one of these root cause categories:
40
-
41
- | Category | Symptoms |
42
- |----------|----------|
43
- | **Timing** | Timeout errors, "element not found", "not visible yet" |
44
- | **State leakage** | Passes alone, fails when run with other tests |
45
- | **Data dependency** | Fails when expected data doesn't exist or has changed |
46
- | **Race condition** | Action fires before page is ready (hydration, animation) |
47
- | **Selector fragility** | Element found but wrong one, or `.first()` picks different element |
48
- | **Environment** | Passes locally, fails in CI (viewport, speed, resources) |
49
-
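A first-pass reading of the table above can be written as a few ordered checks. This is only a heuristic sketch under stated assumptions: the `Observation` fields and the `classify` function are hypothetical, it covers just three of the six categories, and the result is a starting hypothesis for Step 2 rather than a diagnosis.

```typescript
// Hypothetical shapes for illustration; the category names mirror the table above.
type Category = "timing" | "state-leakage" | "environment" | "unclassified";

interface Observation {
  errorMessage: string;
  passesAlone: boolean;       // green when run on its own
  passesInFullSuite: boolean; // green when run with every other test
  passesLocally: boolean;
  passesInCi: boolean;
}

function classify(o: Observation): Category {
  // Where the test fails is a stronger signal than the error text, so check it first.
  if (o.passesLocally && !o.passesInCi) return "environment";
  if (o.passesAlone && !o.passesInFullSuite) return "state-leakage";
  if (/timeout|not found|not visible/i.test(o.errorMessage)) return "timing";
  return "unclassified";
}
```

Race conditions, data dependencies, and selector fragility usually need the investigation in Step 2 to tell apart, which is why the sketch leaves them as `unclassified`.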
50
- ### Step 2: Root Cause Analysis
51
-
52
- Investigate based on the classification:
53
-
54
- **Timing issues:**
55
- - Look for assertions immediately after actions with no wait for the resulting state change
56
- - Check if the test asserts before an API response arrives — search for missing `waitForResponse`
57
- - Look for animations/transitions that affect element state (CSS transitions, skeleton screens)
58
- - Check for `waitForTimeout` being used as a "fix" — this is a symptom, not a cure
59
- - Check whether a `waitUntil` of `load` or `networkidle` would help for navigation
60
-
61
- **State leakage:**
62
- - Run the failing test alone: `npx playwright test --grep "<test name>"`
63
- - Check if tests share mutable state: global variables, database rows, cookies, localStorage
64
- - Look for missing cleanup in `afterEach`/`afterAll`
65
- - Check if `storageState` bleeds between tests or test files
66
- - Check for test data created by one test that another test depends on
67
-
68
- **Race conditions:**
69
- - Identify the race: what two things are happening concurrently?
70
- - Check for click handlers that fire before JavaScript hydration completes
71
- - Look for optimistic UI updates that revert on API response
72
- - Check for actions during navigation transitions (click during page load)
73
- - Look for double-clicks or rapid interactions that trigger duplicate actions
74
-
75
- **Selector fragility:**
76
- - Navigate to the page in the browser and verify the selector currently matches the intended element
77
- - Check if the selector matches multiple elements — `.first()` or `.nth()` is a smell
78
- - Look for dynamically generated IDs, classes, or attributes
79
- - Check for conditional rendering that changes element order or presence
80
- - Verify locators against current DOM structure using `find` and `get_element`
81
-
82
- **Environment issues:**
83
- - Compare CI viewport size vs local — element may be off-screen in CI
84
- - Check for timezone-dependent assertions (dates, timestamps)
85
- - Check for locale-dependent formatting (numbers, currency)
86
- - Check if CI has slower network/CPU affecting timing
87
- - Look for third-party scripts (analytics, chat widgets) that load differently in CI
88
-
89
- ### Step 3: Apply the Correct Fix
90
-
91
- Use the right fix pattern for the diagnosed root cause. **Never apply a fix without understanding the cause.** See [references/fix-patterns.md](references/fix-patterns.md) for full code examples.
92
-
93
- | Category | Principle |
94
- |----------|-----------|
95
- | **Timing** | Replace sleeps with event-driven waits (`waitForResponse`, auto-retrying assertions) |
96
- | **State leakage** | Unique data per test, API-based reset in `beforeEach`, no shared mutable state |
97
- | **Race condition** | Use `Promise.all` for action + expected response; wait for hydration before interaction |
98
- | **Selector fragility** | Scope locators to containers with unique content; avoid `.first()` and position-dependent selectors |
99
- | **Environment** | Explicit viewport, timezone-agnostic assertions, block interfering third-party scripts |
100
-
101
- ### Step 4: Verify the Fix
102
-
103
- 1. **Run the test 5+ times** to confirm stability:
104
- ```bash
105
- npx playwright test <file> --repeat-each=5 --reporter=list 2>&1
106
- ```
107
- 2. **Run with the full test suite** to verify no state leakage:
108
- ```bash
109
- npx playwright test --reporter=list 2>&1
110
- ```
111
- 3. If still flaky → return to Step 2 with the new failure output. The initial classification may have been wrong.
112
- 4. **Maximum 3 fix-and-rerun cycles.** If the test is still flaky after 3 attempts, stop and report the diagnostic findings (root cause hypothesis, fixes attempted, remaining failure output) so the user can decide next steps. Do not continue looping.
113
-
114
- ### Step 5: Summarize
115
-
116
- Report:
117
- 1. **Root cause** — what made the test flaky and why
118
- 2. **Fix applied** — what changed and why this fix addresses the root cause
119
- 3. **Verification** — how many consecutive runs passed
120
- 4. **Prevention** — what pattern to follow in future tests to avoid this class of flakiness
121
-
122
- ## Flakiness Checklist (Less Obvious Causes)
123
-
124
- When the standard categories don't fit, check these:
125
-
126
- - [ ] **Viewport size** — element off-screen in CI (smaller viewport)
127
- - [ ] **Font rendering** — text matching fails due to font differences across OS
128
- - [ ] **Timezone** — date/time assertions fail in different timezones
129
- - [ ] **Locale** — number/currency formatting differs (1,000 vs 1.000)
130
- - [ ] **Third-party scripts** — analytics/chat widgets change DOM or block clicks
131
- - [ ] **Cookie consent banners** — overlay blocks click targets
132
- - [ ] **Feature flags** — different features enabled in different environments
133
- - [ ] **Database state** — shared test database with stale or conflicting data
134
- - [ ] **Parallel execution** — tests interfere when run in parallel workers
135
- - [ ] **Browser caching** — cached responses differ from fresh ones
136
- - [ ] **Service workers** — intercepting requests differently than expected
137
- - [ ] **Lazy loading** — elements not yet in DOM when test tries to interact
138
-
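The locale item in the checklist is easy to reproduce outside the browser. The sketch below uses the standard `Intl.NumberFormat` API and assumes a runtime with full ICU locale data; `formatCount` and `digitsOnly` are hypothetical helper names.

```typescript
// Format the same number under two locales to show how the rendered text diverges.
function formatCount(value: number, locale: string): string {
  return new Intl.NumberFormat(locale).format(value);
}

const enUs = formatCount(1000, "en-US"); // "1,000"
const deDe = formatCount(1000, "de-DE"); // "1.000"

// One locale-agnostic comparison for whole numbers: ignore grouping separators
// by keeping only the digits. This is not suitable for values with decimals.
function digitsOnly(text: string): string {
  return text.replace(/\D/g, "");
}

const sameValue = digitsOnly(enUs) === digitsOnly(deDe); // true
```

An assertion on the literal string `"1,000"` would pass under one locale and fail under the other, while a comparison on the normalized digits passes under both.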
139
- ## Anti-Patterns: What is NOT a Fix
140
-
141
- These mask the problem. Never apply them without a real fix:
142
-
143
- | "Fix" | Why It's Wrong | Real Fix |
144
- |-------|---------------|----------|
145
- | `waitForTimeout(3000)` | Hides timing race, will break under load | Wait for the specific event |
146
- | `.first()` added | Hides selector ambiguity | Narrow the selector |
147
- | Increased timeout to 30s | Hides missing wait or slow setup | Find what you're actually waiting for |
148
- | `test.skip()` | Ignoring the problem | Diagnose and fix |
149
- | `retries: 3` without fix | Masks real failures, wastes CI time | Fix the root cause, then keep retries as a safety net |
150
- | `{ force: true }` | Bypasses actionability checks, hides overlapping elements or disabled state | Find and fix the actionability issue: wait for overlay to disappear, scroll element into view, or wait for enabled state |
151
- | `try/catch` swallowing errors | Test passes but doesn't verify anything | Fix the assertion |
152
-
153
- ## Multiple Flaky Tests
154
-
155
- When a suite has several flaky tests:
156
-
157
- 1. **Triage first.** Run the full suite once and group failures by root cause category (timing, state leakage, etc.). Shared root causes (broken fixture, leaking state) should be fixed once, not per-test.
158
- 2. **Fix shared infrastructure issues first.** A bad `beforeEach`, a leaking `storageState`, or a missing cleanup can cause many tests to fail. One fix resolves many failures.
159
- 3. **Split independent fixes across subagents** when the fix scopes do not overlap (different test files, no shared fixtures). Pass each subagent the test file path, this diagnostic workflow, and the root cause classification table.
160
- 4. The 3 fix-and-rerun cycle limit applies **per test**, not globally.