@brunosps00/dev-workflow 0.11.0 → 0.15.0

Files changed (127)
  1. package/README.md +54 -5
  2. package/lib/constants.js +20 -20
  3. package/lib/init.js +24 -1
  4. package/lib/migrate-skills.js +129 -0
  5. package/lib/removed-bundled-skills.js +16 -0
  6. package/lib/uninstall.js +6 -2
  7. package/lib/utils.js +43 -1
  8. package/package.json +1 -1
  9. package/scaffold/en/agent-instructions.md +68 -0
  10. package/scaffold/en/commands/dw-autopilot.md +1 -1
  11. package/scaffold/en/commands/dw-brainstorm.md +1 -1
  12. package/scaffold/en/commands/dw-bugfix.md +4 -3
  13. package/scaffold/en/commands/dw-code-review.md +1 -0
  14. package/scaffold/en/commands/dw-create-tasks.md +6 -0
  15. package/scaffold/en/commands/dw-create-techspec.md +1 -1
  16. package/scaffold/en/commands/dw-deps-audit.md +1 -1
  17. package/scaffold/en/commands/dw-fix-qa.md +1 -1
  18. package/scaffold/en/commands/dw-functional-doc.md +2 -2
  19. package/scaffold/en/commands/dw-help.md +2 -2
  20. package/scaffold/en/commands/dw-redesign-ui.md +7 -7
  21. package/scaffold/en/commands/dw-run-qa.md +5 -4
  22. package/scaffold/en/commands/dw-run-task.md +2 -2
  23. package/scaffold/en/templates/constitution-template.md +1 -1
  24. package/scaffold/pt-br/agent-instructions.md +68 -0
  25. package/scaffold/pt-br/commands/dw-autopilot.md +1 -1
  26. package/scaffold/pt-br/commands/dw-brainstorm.md +1 -1
  27. package/scaffold/pt-br/commands/dw-bugfix.md +4 -3
  28. package/scaffold/pt-br/commands/dw-code-review.md +1 -0
  29. package/scaffold/pt-br/commands/dw-create-tasks.md +6 -0
  30. package/scaffold/pt-br/commands/dw-create-techspec.md +1 -1
  31. package/scaffold/pt-br/commands/dw-deps-audit.md +1 -1
  32. package/scaffold/pt-br/commands/dw-fix-qa.md +1 -1
  33. package/scaffold/pt-br/commands/dw-functional-doc.md +2 -2
  34. package/scaffold/pt-br/commands/dw-help.md +2 -2
  35. package/scaffold/pt-br/commands/dw-redesign-ui.md +7 -7
  36. package/scaffold/pt-br/commands/dw-run-qa.md +5 -4
  37. package/scaffold/pt-br/commands/dw-run-task.md +2 -2
  38. package/scaffold/pt-br/templates/constitution-template.md +1 -1
  39. package/scaffold/skills/dw-council/SKILL.md +1 -1
  40. package/scaffold/skills/dw-incident-response/SKILL.md +164 -0
  41. package/scaffold/skills/dw-incident-response/references/blameless-discipline.md +126 -0
  42. package/scaffold/skills/dw-incident-response/references/communication-templates.md +107 -0
  43. package/scaffold/skills/dw-incident-response/references/postmortem-template.md +133 -0
  44. package/scaffold/skills/dw-incident-response/references/runbook-templates.md +169 -0
  45. package/scaffold/skills/dw-incident-response/references/severity-and-triage.md +186 -0
  46. package/scaffold/skills/dw-llm-eval/SKILL.md +148 -0
  47. package/scaffold/skills/dw-llm-eval/references/agent-eval.md +252 -0
  48. package/scaffold/skills/dw-llm-eval/references/judge-calibration.md +169 -0
  49. package/scaffold/skills/dw-llm-eval/references/oracle-ladder.md +171 -0
  50. package/scaffold/skills/dw-llm-eval/references/rag-metrics.md +186 -0
  51. package/scaffold/skills/dw-llm-eval/references/reference-dataset.md +190 -0
  52. package/scaffold/skills/dw-testing-discipline/SKILL.md +171 -0
  53. package/scaffold/skills/dw-testing-discipline/references/agent-guardrails.md +170 -0
  54. package/scaffold/skills/dw-testing-discipline/references/anti-patterns.md +336 -0
  55. package/scaffold/skills/dw-testing-discipline/references/core-rules.md +128 -0
  56. package/scaffold/skills/dw-testing-discipline/references/flaky-discipline.md +163 -0
  57. package/scaffold/skills/dw-testing-discipline/references/patterns.md +241 -0
  58. package/scaffold/skills/dw-testing-discipline/references/playwright-recipes.md +282 -0
  59. package/scaffold/skills/{webapp-testing → dw-testing-discipline}/references/security-boundary.md +1 -1
  60. package/scaffold/skills/dw-ui-discipline/SKILL.md +150 -0
  61. package/scaffold/skills/dw-ui-discipline/references/accessibility-floor.md +225 -0
  62. package/scaffold/skills/dw-ui-discipline/references/curated-defaults.md +195 -0
  63. package/scaffold/skills/dw-ui-discipline/references/hard-gate.md +162 -0
  64. package/scaffold/skills/dw-ui-discipline/references/state-matrix.md +101 -0
  65. package/scaffold/skills/dw-ui-discipline/references/visual-slop.md +152 -0
  66. package/scaffold/skills/ui-ux-pro-max/LICENSE +0 -21
  67. package/scaffold/skills/ui-ux-pro-max/SKILL.md +0 -659
  68. package/scaffold/skills/ui-ux-pro-max/data/_sync_all.py +0 -414
  69. package/scaffold/skills/ui-ux-pro-max/data/app-interface.csv +0 -31
  70. package/scaffold/skills/ui-ux-pro-max/data/charts.csv +0 -26
  71. package/scaffold/skills/ui-ux-pro-max/data/colors.csv +0 -162
  72. package/scaffold/skills/ui-ux-pro-max/data/design.csv +0 -1776
  73. package/scaffold/skills/ui-ux-pro-max/data/draft.csv +0 -1779
  74. package/scaffold/skills/ui-ux-pro-max/data/google-fonts.csv +0 -1924
  75. package/scaffold/skills/ui-ux-pro-max/data/icons.csv +0 -106
  76. package/scaffold/skills/ui-ux-pro-max/data/landing.csv +0 -35
  77. package/scaffold/skills/ui-ux-pro-max/data/products.csv +0 -162
  78. package/scaffold/skills/ui-ux-pro-max/data/react-performance.csv +0 -45
  79. package/scaffold/skills/ui-ux-pro-max/data/stacks/angular.csv +0 -51
  80. package/scaffold/skills/ui-ux-pro-max/data/stacks/astro.csv +0 -54
  81. package/scaffold/skills/ui-ux-pro-max/data/stacks/flutter.csv +0 -53
  82. package/scaffold/skills/ui-ux-pro-max/data/stacks/html-tailwind.csv +0 -56
  83. package/scaffold/skills/ui-ux-pro-max/data/stacks/jetpack-compose.csv +0 -53
  84. package/scaffold/skills/ui-ux-pro-max/data/stacks/laravel.csv +0 -51
  85. package/scaffold/skills/ui-ux-pro-max/data/stacks/nextjs.csv +0 -53
  86. package/scaffold/skills/ui-ux-pro-max/data/stacks/nuxt-ui.csv +0 -51
  87. package/scaffold/skills/ui-ux-pro-max/data/stacks/nuxtjs.csv +0 -59
  88. package/scaffold/skills/ui-ux-pro-max/data/stacks/react-native.csv +0 -52
  89. package/scaffold/skills/ui-ux-pro-max/data/stacks/react.csv +0 -54
  90. package/scaffold/skills/ui-ux-pro-max/data/stacks/shadcn.csv +0 -61
  91. package/scaffold/skills/ui-ux-pro-max/data/stacks/svelte.csv +0 -54
  92. package/scaffold/skills/ui-ux-pro-max/data/stacks/swiftui.csv +0 -51
  93. package/scaffold/skills/ui-ux-pro-max/data/stacks/threejs.csv +0 -54
  94. package/scaffold/skills/ui-ux-pro-max/data/stacks/vue.csv +0 -50
  95. package/scaffold/skills/ui-ux-pro-max/data/styles.csv +0 -85
  96. package/scaffold/skills/ui-ux-pro-max/data/typography.csv +0 -74
  97. package/scaffold/skills/ui-ux-pro-max/data/ui-reasoning.csv +0 -162
  98. package/scaffold/skills/ui-ux-pro-max/data/ux-guidelines.csv +0 -100
  99. package/scaffold/skills/ui-ux-pro-max/scripts/core.py +0 -262
  100. package/scaffold/skills/ui-ux-pro-max/scripts/design_system.py +0 -1148
  101. package/scaffold/skills/ui-ux-pro-max/scripts/search.py +0 -114
  102. package/scaffold/skills/ui-ux-pro-max/skills/brand/SKILL.md +0 -97
  103. package/scaffold/skills/ui-ux-pro-max/skills/design/SKILL.md +0 -302
  104. package/scaffold/skills/ui-ux-pro-max/skills/design-system/SKILL.md +0 -244
  105. package/scaffold/skills/ui-ux-pro-max/templates/base/quick-reference.md +0 -297
  106. package/scaffold/skills/ui-ux-pro-max/templates/base/skill-content.md +0 -358
  107. package/scaffold/skills/ui-ux-pro-max/templates/platforms/agent.json +0 -21
  108. package/scaffold/skills/ui-ux-pro-max/templates/platforms/augment.json +0 -18
  109. package/scaffold/skills/ui-ux-pro-max/templates/platforms/claude.json +0 -21
  110. package/scaffold/skills/ui-ux-pro-max/templates/platforms/codebuddy.json +0 -21
  111. package/scaffold/skills/ui-ux-pro-max/templates/platforms/codex.json +0 -21
  112. package/scaffold/skills/ui-ux-pro-max/templates/platforms/continue.json +0 -21
  113. package/scaffold/skills/ui-ux-pro-max/templates/platforms/copilot.json +0 -21
  114. package/scaffold/skills/ui-ux-pro-max/templates/platforms/cursor.json +0 -21
  115. package/scaffold/skills/ui-ux-pro-max/templates/platforms/droid.json +0 -21
  116. package/scaffold/skills/ui-ux-pro-max/templates/platforms/gemini.json +0 -21
  117. package/scaffold/skills/ui-ux-pro-max/templates/platforms/kilocode.json +0 -21
  118. package/scaffold/skills/ui-ux-pro-max/templates/platforms/kiro.json +0 -21
  119. package/scaffold/skills/ui-ux-pro-max/templates/platforms/opencode.json +0 -21
  120. package/scaffold/skills/ui-ux-pro-max/templates/platforms/qoder.json +0 -21
  121. package/scaffold/skills/ui-ux-pro-max/templates/platforms/roocode.json +0 -21
  122. package/scaffold/skills/ui-ux-pro-max/templates/platforms/trae.json +0 -21
  123. package/scaffold/skills/ui-ux-pro-max/templates/platforms/warp.json +0 -18
  124. package/scaffold/skills/ui-ux-pro-max/templates/platforms/windsurf.json +0 -21
  125. package/scaffold/skills/webapp-testing/SKILL.md +0 -138
  126. package/scaffold/skills/webapp-testing/assets/test-helper.js +0 -56
  127. package/scaffold/skills/{webapp-testing → dw-testing-discipline}/references/three-workflow-patterns.md +0 -0
@@ -0,0 +1,163 @@
1
+ # Flaky discipline — taxonomy, quarantine, SLOs
2
+
3
+ A flaky test is one that produces different verdicts (pass/fail) on the same code across runs. Flaky tests corrode trust in the suite faster than any other category of test debt.
4
+
5
+ ## The four root causes (in order of frequency)
6
+
7
+ ### Cause 1: Race conditions (concurrency)
8
+
9
+ **Tells:**
10
+ - Test passes locally, fails in CI (or vice versa).
11
+ - Failure rate correlates with CI machine load.
12
+ - Adding `await page.waitForTimeout(100)` "fixes" it.
13
+
14
+ **Common scenarios:**
15
+ - Async operation completes after test moves on (missing `await`).
16
+ - Two requests sent simultaneously, response order matters.
17
+ - DOM update happens after assertion runs.
18
+ - Database write not yet committed when read fires.
19
+
20
+ **Fix:**
21
+ - Replace wall-clock waits with condition-based waits (`waitFor`, `toBeVisible`, `expect.poll`).
22
+ - Add proper `await` on every async operation.
23
+ - Use transaction boundaries explicitly when test reads its own write.
24
+
25
+ ### Cause 2: Test order dependency
26
+
27
+ **Tells:**
28
+ - Test passes when suite runs in order, fails with `--shuffle`.
29
+ - Test fails when run with `.only` in isolation.
30
+ - Failures cluster on first run after CI restart but not afterwards.
31
+
32
+ **Common scenarios:**
33
+ - `beforeAll` populates shared state; second test mutates it; third test fails.
34
+ - Test A creates a global mock; Test B inherits it unexpectedly.
35
+ - Database row persists across tests because cleanup is in `afterEach` but a test threw mid-execution.
36
+
37
+ **Fix:**
38
+ - Move state creation from `beforeAll` to `beforeEach`.
39
+ - Reset shared state in `beforeEach` (clean slate every test).
40
+ - Avoid global mocks; scope mocks to the test that needs them.
41
+ - Run with `--shuffle` in CI to catch new order dependencies.
42
+
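The order dependency described above can be reproduced without any test framework. The sketch below is illustrative only: `tests`, `runShared`, and `runIsolated` are hypothetical stand-ins for a runner with `beforeAll`-style and `beforeEach`-style setup, not part of dev-workflow or any library.

```javascript
// Two tiny "tests" that receive their state (a cart array) and return a verdict.
const tests = {
  addsItem: (cart) => {
    cart.push('book');
    return cart.length === 1;
  },
  startsEmpty: (cart) => cart.length === 0,
};

// beforeAll-style: one cart is created once and shared by every test.
function runShared(order) {
  const cart = [];
  return order.map((name) => tests[name](cart));
}

// beforeEach-style: every test gets a fresh cart.
function runIsolated(order) {
  return order.map((name) => tests[name]([]));
}

console.log(runShared(['startsEmpty', 'addsItem'])); // [ true, true ]
console.log(runShared(['addsItem', 'startsEmpty'])); // [ true, false ]
console.log(runIsolated(['addsItem', 'startsEmpty'])); // [ true, true ]
```

With shared state the verdict depends on execution order, which is exactly what a shuffled run exposes. With per-test setup the order stops mattering.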
43
+ ### Cause 3: Non-deterministic inputs
44
+
45
+ **Tells:**
46
+ - Test fails at month boundary, year boundary, DST change.
47
+ - Test fails based on hostname, locale, timezone.
48
+ - Test fails when an unseeded RNG produces edge values.
49
+
50
+ **Common scenarios:**
51
+ - `new Date()` in production code, tested without clock fake.
52
+ - `Math.random()` for IDs, tested without seed.
53
+ - `Intl.DateTimeFormat` rendering based on system locale.
54
+ - File paths with timestamps, hash IDs based on time.
55
+
56
+ **Fix:**
57
+ - Mock the clock (`vi.useFakeTimers`, `freezegun`).
58
+ - Seed or stub the RNG explicitly in tests (`Math.random = () => 0.5`, or inject it via DI).
59
+ - Pin locale and timezone in CI environment AND in test setup.
60
+
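The clock and RNG fixes above can be sketched without a test framework. This is a minimal illustration, assuming a hypothetical `makeOrderId` function that receives its clock and random source as parameters (the DI route). `createSeededRandom` is a small mulberry32-style generator written for this example, not an API of any library named in this document.

```javascript
// Deterministic PRNG (mulberry32 variant): the same seed always yields the same sequence.
function createSeededRandom(seed) {
  let state = seed >>> 0;
  return function next() {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical production code: clock and RNG are injected, with real defaults.
function makeOrderId({ now = () => new Date(), random = Math.random } = {}) {
  const day = now().toISOString().slice(0, 10); // UTC date, independent of host timezone
  const suffix = Math.floor(random() * 1e6).toString().padStart(6, '0');
  return `ord-${day}-${suffix}`;
}

// In a test: freeze the clock and seed the RNG so every run sees identical inputs.
const fixedNow = () => new Date('2026-01-01T00:00:00Z');
const first = makeOrderId({ now: fixedNow, random: createSeededRandom(42) });
const second = makeOrderId({ now: fixedNow, random: createSeededRandom(42) });
console.log(first === second); // true
```

Injecting the sources keeps production defaults untouched while letting the test control every non-deterministic input.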
61
+ ### Cause 4: External dependencies
62
+
63
+ **Tells:**
64
+ - Test fails when a third-party service has an outage.
65
+ - Test fails when CI runs against a real API and hits rate limits.
66
+ - Test fails differently for different geographic CI runners.
67
+
68
+ **Common scenarios:**
69
+ - Direct call to external API in unit tests.
70
+ - DNS lookup baked into test execution path.
71
+ - CDN-hosted resources in E2E tests.
72
+
73
+ **Fix:**
74
+ - Mock external services at unit/integration layers.
75
+ - Use contract tests instead of live calls.
76
+ - For E2E, use a sandbox account / dedicated test environment.
77
+
78
+ ## Quarantine workflow
79
+
80
+ When a test flakes:
81
+
82
+ ### Within 1 hour of detection
83
+
84
+ 1. **Quarantine the test.** Add `.skip` or equivalent. Add a comment:
85
+
86
+ ```javascript
87
+ test.skip('FLAKY-2026-05-12: race condition in checkout flow — owner: bruno, fix-by: 2026-05-19', () => {
88
+ // ...
89
+ });
90
+ ```
91
+
92
+ 2. **File a tracking issue.** Title: `FLAKY: <test name>`. Body includes:
93
+ - Test name and file
94
+ - Failure mode observed (race? order-dependency? non-determinism?)
95
+ - First detection: CI run URL, timestamp
96
+ - Hypothesis (if any)
97
+ - Owner and fix-by date
98
+
99
+ 3. **Note in CI.** The next CI run shows "1 quarantined" — make this visible on the dashboard.
100
+
101
+ ### Within 24 hours
102
+
103
+ 1. **Named owner assigned.** Not "team X" — a person.
104
+ 2. **Fix-by date set.** Default 5 business days. Major flake (production-path test): 2 days.
105
+
106
+ ### When fix-by passes without fix
107
+
108
+ Escalate:
109
+ - Pair the owner with someone for a debug session.
110
+ - If still unfixed after 2× the fix-by window, the test is removed (not skipped). A failing un-skipped test is better than a perpetually skipped test.
111
+
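The escalation rule can be checked mechanically from the skip marker. The sketch below is an assumption-laden illustration: `parseFlakyMarker` and `quarantineStatus` are hypothetical helpers, the marker format is inferred from the quarantine example above, and the "2× the fix-by window" is measured from the date in the marker.

```javascript
// Marker format inferred from the example: FLAKY-<detected>: ... owner: <name>, fix-by: <date>
const MARKER = /FLAKY-(\d{4}-\d{2}-\d{2}):.*?owner:\s*([^,]+),\s*fix-by:\s*(\d{4}-\d{2}-\d{2})/;

function parseFlakyMarker(title) {
  const match = MARKER.exec(title);
  if (!match) return null; // a skip without owner or fix-by date cannot be tracked
  const [, detected, owner, fixBy] = match;
  return { detected, owner: owner.trim(), fixBy };
}

// Parse an ISO date (YYYY-MM-DD) as UTC midnight to avoid timezone drift.
const toUtcMs = (isoDate) => Date.parse(`${isoDate}T00:00:00Z`);

// Classify a quarantined test on a given day.
function quarantineStatus(marker, today) {
  const detectedMs = toUtcMs(marker.detected);
  const windowMs = toUtcMs(marker.fixBy) - detectedMs;
  const removeAfterMs = detectedMs + 2 * windowMs; // twice the fix-by window
  const todayMs = toUtcMs(today);
  if (todayMs > removeAfterMs) return 'remove';
  if (todayMs > toUtcMs(marker.fixBy)) return 'escalate';
  return 'within-window';
}

const marker = parseFlakyMarker(
  'FLAKY-2026-05-12: race condition in checkout flow - owner: bruno, fix-by: 2026-05-19'
);
console.log(quarantineStatus(marker, '2026-05-15')); // within-window
console.log(quarantineStatus(marker, '2026-05-21')); // escalate
console.log(quarantineStatus(marker, '2026-05-27')); // remove
```

A check like this could run in CI against skipped test titles so that overdue quarantines surface on their own.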
112
+ ## SLOs
113
+
114
+ ### `flaky_rate` (first-class metric)
115
+
116
+ - Definition: `(tests that pass on retry but fail on first run) / (total test runs)`.
117
+ - Target: < 1–2% per week.
118
+ - Alert at: > 5% on any given day.
119
+
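As a worked example of the definition above, the metric can be computed from per-run attempt records. The record shape (`test`, `attempts`) is an assumption made for illustration, not a dev-workflow format.

```javascript
// Each record is one test run in CI; attempts lists the verdicts in order, retries included.
function flakyRate(runs) {
  if (runs.length === 0) return 0;
  const flaky = runs.filter(
    (run) => run.attempts[0] === 'fail' && run.attempts.slice(1).includes('pass')
  );
  return flaky.length / runs.length;
}

const runs = [
  { test: 'checkout total', attempts: ['pass'] },
  { test: 'checkout total', attempts: ['fail', 'pass'] }, // flaky: passed on retry
  { test: 'login redirect', attempts: ['pass'] },
  { test: 'refund policy', attempts: ['fail', 'fail'] }, // consistently failing, not flaky
];

const rate = flakyRate(runs);
console.log(rate); // 0.25
console.log(rate > 0.05 ? 'alert' : 'ok'); // alert
```

A consistently failing test is deliberately excluded: it is a real failure, not a flake.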
120
+ ### `time-to-fix-flaky`
121
+
122
+ - Definition: hours from quarantine to fix-merged.
123
+ - Target: median < 24 hours; p95 < 7 days.
124
+
125
+ ### `quarantine inventory`
126
+
127
+ - Definition: count of currently-skipped tests with `FLAKY-*` markers.
128
+ - Target: < 10 at any time.
129
+ - Alert at: > 25 (the quarantine has become a cemetery — emergency cleanup).
130
+
131
+ ## What NOT to do
132
+
133
+ - **Auto-retry as fix.** `retries: 3` in CI config is hiding flakes, not fixing them. The 4th run that finally passes still validated nothing.
134
+ - **Increase timeouts indefinitely.** A timeout that grows from 5s to 30s "to make CI pass" means the test isn't waiting on the right condition.
135
+ - **Remove the test without investigation.** "It's been flaky forever, delete it" — sometimes correct, but make sure the underlying invariant is captured elsewhere.
136
+ - **Mark skip without owner.** A skip is a debt. An unowned debt is a perpetual liability.
137
+
138
+ ## When a test should be permanently removed
139
+
140
+ A flaky test should be DELETED (not just skipped) when:
141
+
142
+ 1. The invariant it tests is covered elsewhere (duplicate per A23).
143
+ 2. The invariant it tests is no longer a real requirement.
144
+ 3. The test was always probabilistic by design and never had value.
145
+
146
+ Deletion is acceptable; abandonment-by-skip is not.
147
+
148
+ ## Real-systems-at-final-gate principle
149
+
150
+ Many flakes come from mocks drifting from reality. The defense:
151
+
152
+ - **Unit:** mock the world; fast feedback; flake budget tiny here.
153
+ - **Integration:** real DB (testcontainers); mock external services with contract validation.
154
+ - **Contract:** Pact / schemathesis verifying producer-consumer agreement.
155
+ - **E2E:** real services in a preview environment; near-zero mocks.
156
+
157
+ When CI is wired this way, a flake at unit usually = race or order-dependency (fixable). A flake at E2E usually = real environment issue (fix the environment, not the test).
158
+
159
+ ## Integration with dev-workflow
160
+
161
+ - `/dw-fix-qa` uses this taxonomy when retest cycles produce inconsistent results: classify the flake, apply the right fix, document.
162
+ - `/dw-code-review` flags tests being modified that have a `FLAKY-*` marker — review must verify the flake is now actually fixed, not just made less likely.
163
+ - `/dw-run-qa` weekly summary includes the `flaky_rate` metric.
@@ -0,0 +1,241 @@
1
+ # Twelve positive patterns — pseudo-code that survives refactors
2
+
3
+ Each pattern applies across testing frameworks (Jest, Vitest, Playwright, Cypress, pytest, JUnit). The examples are JavaScript-flavored pseudo-code; translate to your stack.
4
+
5
+ ## 1. Query by behavior and accessible role, not CSS selectors
6
+
7
+ **Pattern:**
8
+
9
+ ```javascript
10
+ // GOOD — describes what the user does
11
+ const submitBtn = page.getByRole('button', { name: 'Submit order' });
12
+ await submitBtn.click();
13
+ await expect(page.getByText(/order #\d+ confirmed/i)).toBeVisible();
14
+ ```
15
+
16
+ **Anti-pattern:**
17
+
18
+ ```javascript
19
+ // BAD — describes implementation
20
+ await page.click('.btn-primary.submit-btn');
21
+ expect(page.querySelector('.confirmation-toast')).toBeTruthy();
22
+ ```
23
+
24
+ The good version survives a CSS refactor, a className rename, a Tailwind migration. The bad version breaks on each.
25
+
26
+ ## 2. Selector hierarchy
27
+
28
+ When the role isn't enough, climb DOWN this ladder. Stop at the highest rung that disambiguates:
29
+
30
+ 1. **Role + name** — `getByRole('button', { name: 'Submit' })`
31
+ 2. **Label / form association** — `getByLabel('Email')`
32
+ 3. **Text content** — `getByText('Welcome back')`
33
+ 4. **Test-id** — `getByTestId('user-menu')` (escape hatch; do not start here)
34
+ 5. **Structural / CSS** — `querySelector('article:nth-child(3)')` (last resort; flags)
35
+
36
+ Test-id is fine. Test-id as your default is a sign you're not designing accessible UI.
37
+
38
+ ## 3. Wait on observable conditions, never wall-clock
39
+
40
+ **Pattern:**
41
+
42
+ ```javascript
43
+ // GOOD — waits for the actual condition
44
+ await expect(page.getByText('Order confirmed')).toBeVisible({ timeout: 5000 });
45
+ ```
46
+
47
+ **Anti-pattern:**
48
+
49
+ ```javascript
50
+ // BAD — wall-clock hopes
51
+ await page.waitForTimeout(3000);
52
+ expect(page.getByText('Order confirmed')).toBeVisible();
53
+ ```
54
+
55
+ `waitForTimeout` is the #1 source of flakiness. It races with the network, with rendering, with the event loop. Always wait on what you actually need to see.
56
+
57
+ ## 4. Each test independent and order-free
58
+
59
+ Every test runs cleanly when invoked in isolation (with `.only`) or in a randomized order.
60
+
61
+ **Pattern:**
62
+
63
+ ```javascript
64
+ beforeEach(async () => {
65
+ // Set up state THIS test needs
66
+ await db.users.create({ id: 'test-1', email: 'a@b.c' });
67
+ });
68
+
69
+ afterEach(async () => {
70
+ // Clean up — but if tests are independent, you may not need this
71
+ await db.users.deleteAll();
72
+ });
73
+ ```
74
+
75
+ **Anti-pattern:** Shared state in `beforeAll`. Tests passing only when run in suite order. Brittle CI runs.
76
+
77
+ **Healthier:** prefer setup in `beforeEach` over teardown in `afterEach`. A test that sets up its own state from a clean baseline never wonders "did the previous test corrupt me?"
78
+
79
+ ## 5. One behavior per test, as many assertions as that behavior needs
80
+
81
+ **Pattern:**
82
+
83
+ ```javascript
84
+ test('successful login redirects to dashboard and shows user name', async () => {
85
+ await login('user@example.com', 'password');
86
+
87
+ await expect(page).toHaveURL(/\/dashboard/);
88
+ await expect(page.getByRole('heading', { name: /welcome/i })).toBeVisible();
89
+ await expect(page.getByText('user@example.com')).toBeVisible();
90
+ });
91
+ ```
92
+
93
+ ONE behavior ("successful login leads to dashboard with user info"). THREE assertions because that behavior has three observable parts.
94
+
95
+ **Anti-pattern:** Two unrelated behaviors crammed into one test ("login + then-also-test-search-works"). When it fails, you don't know which broke.
96
+
97
+ ## 6. Names read as specifications
98
+
99
+ **Pattern:**
100
+
101
+ ```
102
+ should reject invalid email when registering given no prior account
103
+ should approve refund within SLA given amount under threshold
104
+ should sync calendar events when user reconnects given offline edits
105
+ ```
106
+
107
+ Form: `should <expected outcome> when <triggering condition> [given <starting state>]`
108
+
109
+ **Anti-pattern:**
110
+
111
+ ```
112
+ test_login
113
+ test_email_1
114
+ testHappy
115
+ ```
116
+
117
+ These tell you nothing on failure. A spec-style name doubles as documentation when failure messages get noisy.
118
+
119
+ ## 7. Table-driven / parameterized when inputs vary
120
+
121
+ **Pattern:**
122
+
123
+ ```javascript
124
+ describe.each([
125
+ { input: '', expected: 'required' },
126
+ { input: 'a', expected: 'too short' },
127
+ { input: 'a'.repeat(100), expected: 'too long' },
128
+ { input: 'a@b.c', expected: 'valid' },
129
+ ])('validateEmail($input)', ({ input, expected }) => {
130
+ test(`returns ${expected}`, () => {
131
+ expect(validateEmail(input)).toBe(expected);
132
+ });
133
+ });
134
+ ```
135
+
136
+ Easy to add cases; impossible to forget one input class.
137
+
138
+ **Anti-pattern:** Five copy-pasted tests with one variable changed. Drift between them; one gets updated, others don't.
139
+
140
+ ## 8. Build test data via factories
141
+
142
+ **Pattern:**
143
+
144
+ ```javascript
145
+ const buildUser = (overrides = {}) => ({
146
+ id: 'u-' + Math.random().toString(36).slice(2),
147
+ email: 'test@example.com',
148
+ role: 'member',
149
+ createdAt: new Date(),
150
+ ...overrides,
151
+ });
152
+
153
+ test('admin can delete users', () => {
154
+ const admin = buildUser({ role: 'admin' });
155
+ const target = buildUser();
156
+ expect(canDelete(admin, target)).toBe(true);
157
+ });
158
+ ```
159
+
160
+ Tests focus on the FIELDS that matter to the behavior. Default everything else.
161
+
162
+ **Anti-pattern:** Literal 50-field JSON blobs copy-pasted across tests. When the schema changes, you update them all — or worse, miss some.
163
+
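One possible refinement of the factory pattern, sketched under assumptions: sequential ids and a fixed timestamp replace `Math.random()` and `new Date()`, so a failing test reproduces with identical data. `createUserFactory` and the `canDelete` rule below are hypothetical and exist only for this illustration.

```javascript
// Factory with deterministic defaults: sequential ids and a fixed timestamp.
function createUserFactory() {
  let sequence = 0;
  return function buildUser(overrides = {}) {
    sequence += 1;
    return {
      id: `u-${sequence}`,
      email: `user${sequence}@example.com`,
      role: 'member',
      createdAt: new Date('2026-01-01T00:00:00Z'),
      ...overrides, // the test states only the fields that matter to the behavior
    };
  };
}

// Hypothetical rule under test: only admins may delete, and never themselves.
const canDelete = (actor, target) => actor.role === 'admin' && actor.id !== target.id;

const buildUser = createUserFactory(); // create one per test to reset the sequence
const admin = buildUser({ role: 'admin' });
const target = buildUser();
console.log(admin.id, target.id); // u-1 u-2
console.log(canDelete(admin, target)); // true
console.log(canDelete(target, admin)); // false
```

Creating the factory inside each test (or in `beforeEach`) keeps ids independent of test order.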
164
+ ## 9. Mock at boundaries you don't control
165
+
166
+ **Pattern:**
167
+
168
+ | Boundary | Test treatment |
169
+ |----------|---------------|
170
+ | Your own modules | Real (don't mock) |
171
+ | Your own DB (with testcontainers) | Real |
172
+ | Third-party HTTP API | Mock at fetch/axios level |
173
+ | Cloud SDK (AWS, GCP, Stripe) | Mock at SDK level OR sandbox account |
174
+ | System clock | Mock when test depends on time |
175
+ | RNG | Mock when test depends on randomness |
176
+ | File system (when external) | Mock; in tests of fs logic, real temp dir |
177
+
178
+ **Anti-pattern:** Mocking your own modules so the test is fast. You're now testing the mock setup, not the code.
179
+
180
+ ## 10. Real systems gate final merge
181
+
182
+ **Pattern:**
183
+
184
+ ```
185
+ unit (mocks ok) → every commit, run locally and in CI
186
+ integration (real DB) → every PR, run in CI
187
+ contract (boundary) → every PR
188
+ E2E (real services) → before merge to main, run in CI on preview env
189
+ ```
190
+
191
+ No merge to main without a real-system path going green. Mocks are speed, real is truth.
192
+
193
+ **Anti-pattern:** 100% mocked test suite. "It all passes locally" → first user request fails because the mock didn't match the real API shape.
194
+
195
+ ## 11. Mutation score over coverage percentage
196
+
197
+ **Pattern:**
198
+
199
+ Set up mutation testing (Stryker for JS/TS, mutmut for Python, etc.) ONCE per project. Run weekly on critical modules.
200
+
201
+ ```bash
202
+ npx stryker run
203
+ # Output: 87 mutants, 78 killed, 9 survived → mutation score 89.7%
204
+ ```
205
+
206
+ A surviving mutant means: this code path runs in tests, but the tests don't actually assert anything that breaks when the code changes. Investigate each.
207
+
208
+ **Anti-pattern:** 95% line coverage with assertions like `expect(result).toBeTruthy()` — every line ran, but mutations all survive. The suite is decorative.
209
+
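To make "surviving mutant" concrete, here is a hand-rolled sketch that does not use Stryker or mutmut. The function `isAdult`, its mutant, and both suites are hypothetical, and the score figures are made up for the arithmetic.

```javascript
// Code under test and one hand-written mutant (>= changed to >).
const isAdult = (age) => age >= 18;
const isAdultMutant = (age) => age > 18;

// Weak suite: executes every line but never probes the boundary.
const weakSuite = (fn) => fn(30) === true && fn(5) === false;

// Stronger suite: adds the boundary value, which distinguishes >= from >.
const strongSuite = (fn) => weakSuite(fn) && fn(18) === true;

// A mutant survives when the suite still passes against it.
console.log(weakSuite(isAdultMutant)); // true  (mutant survives)
console.log(strongSuite(isAdultMutant)); // false (mutant killed)

// Mutation score = killed / total mutants, as a percentage.
const mutationScore = (killed, total) => (killed / total) * 100;
console.log(mutationScore(45, 50).toFixed(1)); // 90.0
```

Both suites give 100% line coverage of `isAdult`; only the second one notices when the comparison changes.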
210
+ ## 12. Page Object Model is a tool, not a religion
211
+
212
+ For small E2E suites (<20 tests, <5 pages), POM is over-engineering — direct queries are clearer.
213
+
214
+ For large suites (>50 tests, many pages, multi-step flows), POM pays off:
215
+
216
+ ```javascript
217
+ class CheckoutPage {
218
+ constructor(page) { this.page = page; }
219
+
220
+ async fillShipping(addr) { /* ... */ }
221
+ async selectPayment(method) { /* ... */ }
222
+ async place() { /* ... */ }
223
+ waitForConfirmation() { return this.page.getByText(/confirmed/i); } // not async: returns a Locator that expect() can retry on
224
+ }
225
+
226
+ test('checkout end-to-end', async ({ page }) => {
227
+ const checkout = new CheckoutPage(page);
228
+ await checkout.fillShipping(defaultAddress);
229
+ await checkout.selectPayment('card');
230
+ await checkout.place();
231
+ await expect(checkout.waitForConfirmation()).toBeVisible();
232
+ });
233
+ ```
234
+
235
+ The page object hides the selector mess; tests read like specs. But for a 5-test suite, that wrapper is just noise.
236
+
237
+ **Rule of thumb:** apply POM when you have ≥3 tests sharing the same page interactions. Otherwise inline.
238
+
239
+ ## Applying these patterns
240
+
241
+ When `/dw-run-task` generates a new test, it must comply with these 12. `/dw-code-review` checks each diff hunk under test paths against these patterns and the anti-patterns in the sibling reference. Violations become findings.
@@ -0,0 +1,282 @@
1
+ # Playwright recipes — concrete tactical patterns
2
+
3
+ Practical Playwright code for the common scenarios. Use this when `/dw-run-qa` runs in UI mode or when `/dw-functional-doc` needs E2E coverage.
4
+
5
+ > These recipes ASSUME the doctrine in this skill (core rules, positive patterns) has already been applied. Recipes are the HOW once the WHY is settled.
6
+
7
+ ## Basic Navigation
8
+
9
+ ```javascript
10
+ import { test, expect } from '@playwright/test';
11
+
12
+ test('homepage loads and shows hero', async ({ page }) => {
13
+ await page.goto('http://localhost:3000');
14
+ await expect(page).toHaveTitle(/Welcome/);
15
+ await expect(page.getByRole('heading', { level: 1 })).toBeVisible();
16
+ });
17
+ ```
18
+
19
+ Key practices:
20
+ - Use `await expect(page).toHaveTitle(/pattern/)` instead of `await page.title()` plus a manual assertion; the web-first assertion retries until the title matches or the timeout expires.
21
+ - Use `getByRole`, not selectors (Positive Pattern #1).
22
+
23
+ ## Form interaction
24
+
25
+ ```javascript
26
+ test('login redirects to dashboard with user info', async ({ page }) => {
27
+ await page.goto('/login');
28
+
29
+ // Use labels, not selectors
30
+ await page.getByLabel('Email').fill('user@example.com');
31
+ await page.getByLabel('Password').fill('secret123');
32
+ await page.getByRole('button', { name: 'Sign in' }).click();
33
+
34
+ // Wait on observable outcome, not wall-clock
35
+ await expect(page).toHaveURL(/\/dashboard/);
36
+ await expect(page.getByText('user@example.com')).toBeVisible();
37
+ });
38
+ ```
39
+
40
+ ## Screenshot capture (debugging + visual evidence)
41
+
42
+ ```javascript
43
+ // For debugging
44
+ test('checkout flow', async ({ page }) => {
45
+ try {
46
+ await page.goto('/checkout');
47
+ // ... interaction
48
+ } catch (err) {
49
+ await page.screenshot({ path: `failure-${Date.now()}.png`, fullPage: true });
50
+ throw err;
51
+ }
52
+ });
53
+
54
+ // For visual regression (only on stable surfaces)
55
+ test('header looks correct', async ({ page }) => {
56
+ await page.goto('/');
57
+ await expect(page.getByRole('banner')).toHaveScreenshot('header.png');
58
+ });
59
+ ```
60
+
61
+ Warning: visual regression tests need a baseline. Apply the snapshot classification gate from `agent-guardrails.md` (Gate 5).
62
+
63
+ ## Browser console logs (capture for debug)
64
+
65
+ ```javascript
66
+ test('no console errors during checkout', async ({ page }) => {
67
+ const errors = [];
68
+ page.on('console', msg => {
69
+ if (msg.type() === 'error') errors.push(msg.text());
70
+ });
71
+ page.on('pageerror', err => errors.push(err.message));
72
+
73
+ await page.goto('/checkout');
74
+ await page.getByRole('button', { name: 'Complete order' }).click();
75
+ await expect(page.getByText('Order placed')).toBeVisible();
76
+
77
+ expect(errors).toEqual([]);
78
+ });
79
+ ```
80
+
81
+ ## Network interception (mock or inspect)
82
+
83
+ ```javascript
84
+ test('handles API failure gracefully', async ({ page }) => {
85
+ // Intercept the orders API and return 500
86
+ await page.route('**/api/orders', route =>
87
+ route.fulfill({ status: 500, body: 'Server error' })
88
+ );
89
+
90
+ await page.goto('/orders');
91
+
92
+ // The UI should show a recoverable error state, not crash
93
+ await expect(page.getByText('Could not load orders')).toBeVisible();
94
+ await expect(page.getByRole('button', { name: 'Retry' })).toBeVisible();
95
+ });
96
+ ```
97
+
98
+ ```javascript
99
+ test('order list calls API with correct params', async ({ page }) => {
100
+ let capturedUrl;
101
+ page.on('request', req => {
102
+ if (req.url().includes('/api/orders')) capturedUrl = req.url();
103
+ });
104
+
105
+ await page.goto('/orders?filter=overdue');
106
+
107
+ await expect.poll(() => capturedUrl).toContain('filter=overdue');
108
+ });
109
+ ```
110
+
111
+ ## Wait for conditions (NOT wall-clock)
112
+
113
+ ```javascript
114
+ // GOOD — wait on observable condition
115
+ await expect(page.getByText(/order #\d+ confirmed/i)).toBeVisible({ timeout: 10000 });
116
+
117
+ // GOOD — wait on URL change
118
+ await page.waitForURL(/\/dashboard/);
119
+
120
+ // GOOD — wait on network response
121
+ const responsePromise = page.waitForResponse(resp => resp.url().includes('/api/orders') && resp.status() === 200);
122
+ await page.getByRole('button', { name: 'Load orders' }).click();
123
+ await responsePromise;
124
+
125
+ // GOOD — poll a custom condition
126
+ await expect.poll(async () => {
127
+ const count = await page.getByRole('listitem').count();
128
+ return count;
129
+ }, { timeout: 5000 }).toBeGreaterThan(0);
130
+
131
+ // BAD — wall-clock
132
+ await page.waitForTimeout(3000); // ← Anti-pattern A7
133
+ ```
134
+
135
+ ## Mobile viewport testing
136
+
137
+ ```javascript
138
+ test('mobile menu works', async ({ browser }) => {
139
+ const context = await browser.newContext({
140
+ viewport: { width: 375, height: 812 }, // iPhone X
141
+ userAgent: 'Mozilla/5.0 (iPhone; ...)',
142
+ });
143
+ const page = await context.newPage();
144
+
145
+ await page.goto('/');
146
+ await page.getByRole('button', { name: /menu/i }).click();
147
+ await expect(page.getByRole('navigation')).toBeVisible();
148
+ });
149
+ ```
150
+
151
+ For dev-workflow's redesign-ui flow, capture screenshots at TWO viewports: 375px (mobile) and 1440px (desktop). This is documented in `/dw-redesign-ui` step 7.
152
+
153
+ ## Multi-step user journey (Page Object Model)
154
+
155
+ For ≥3 tests sharing the same flow, POM pays off:
156
+
157
+ ```typescript
158
+ // checkout-page.ts
159
+ export class CheckoutPage {
160
+ constructor(private page) {}
161
+
162
+ async goto() {
163
+ await this.page.goto('/checkout');
164
+ }
165
+
166
+ async fillShipping(addr) {
167
+ await this.page.getByLabel('Address').fill(addr.street);
168
+ await this.page.getByLabel('City').fill(addr.city);
169
+ await this.page.getByLabel('ZIP').fill(addr.zip);
170
+ }
171
+
172
+ async selectPayment(method) {
173
+ await this.page.getByLabel(method).check();
174
+ }
175
+
176
+ async place() {
177
+ await this.page.getByRole('button', { name: 'Place order' }).click();
178
+ }
179
+
180
+ async expectConfirmed() {
181
+ await expect(this.page.getByText(/order #\d+ confirmed/i)).toBeVisible();
182
+ }
183
+ }
184
+
185
+ // checkout.spec.ts
186
+ test('checkout end-to-end', async ({ page }) => {
187
+ const checkout = new CheckoutPage(page);
188
+ await checkout.goto();
189
+ await checkout.fillShipping(defaultAddress);
190
+ await checkout.selectPayment('Card ending in 4242');
191
+ await checkout.place();
192
+ await checkout.expectConfirmed();
193
+ });
194
+ ```
+
+ ## Helper utilities
+
+ For small projects, a couple of helpers reduce boilerplate. Drop these in `test-helpers.ts`:
+
+ ```typescript
+ import type { Page } from '@playwright/test';
+
+ /**
+  * Capture browser console logs as the page runs.
+  * Pass the returned array to assert on logs at the end of the test.
+  */
+ export function captureConsoleLogs(page: Page) {
+   const logs: Array<{ type: string; text: string; ts: string }> = [];
+   page.on('console', msg => {
+     logs.push({ type: msg.type(), text: msg.text(), ts: new Date().toISOString() });
+   });
+   return logs;
+ }
+
+ /**
+  * Take a timestamped screenshot for debugging or evidence collection.
+  */
+ export async function captureScreenshot(page: Page, name: string) {
+   const stamp = new Date().toISOString().replace(/[:.]/g, '-');
+   const filename = `${name}-${stamp}.png`;
+   await page.screenshot({ path: filename, fullPage: true });
+   return filename;
+ }
+ ```
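
One way to act on the array returned by `captureConsoleLogs` is to filter it down to error-level entries before asserting. The sketch below is a minimal illustration: `findConsoleErrors`, `ConsoleLogEntry`, and the ignore-list parameter are hypothetical names, not part of the helpers above.

```typescript
// Entry shape pushed by captureConsoleLogs in test-helpers.ts.
interface ConsoleLogEntry {
  type: string;
  text: string;
  ts: string;
}

// Hypothetical helper: keeps only entries logged at "error" level and
// drops any whose text matches one of the ignore patterns.
// Pass patterns without the `g` flag, since RegExp.test is stateful with it.
function findConsoleErrors(logs: ConsoleLogEntry[], ignore: RegExp[] = []): ConsoleLogEntry[] {
  return logs.filter(
    entry => entry.type === 'error' && !ignore.some(pattern => pattern.test(entry.text)),
  );
}
```

At the end of a test this could be used as `expect(findConsoleErrors(logs, [/favicon/])).toEqual([])`, which fails with the offending messages listed in the assertion output.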
+
+ ## Persistent session (auth state)
+
+ When tests need to start logged in, save auth state ONCE in setup:
+
+ ```typescript
+ // global-setup.ts
+ import { chromium } from '@playwright/test';
+
+ export default async () => {
+   // Fail fast with a clear message if the secret is missing.
+   const password = process.env.E2E_PASSWORD;
+   if (!password) throw new Error('E2E_PASSWORD is not set');
+
+   const browser = await chromium.launch();
+   const page = await browser.newPage();
+
+   await page.goto('http://localhost:3000/login');
+   await page.getByLabel('Email').fill('e2e-user@example.com');
+   await page.getByLabel('Password').fill(password);
+   await page.getByRole('button', { name: 'Sign in' }).click();
+   await page.waitForURL(/\/dashboard/);
+
+   // Persist cookies and localStorage so every test can reuse the session.
+   await page.context().storageState({ path: '.auth/user.json' });
+   await browser.close();
+ };
+
+ // playwright.config.ts
+ import { defineConfig } from '@playwright/test';
+
+ export default defineConfig({
+   globalSetup: require.resolve('./global-setup'),
+   use: {
+     storageState: '.auth/user.json',
+   },
+ });
+ ```
+
+ Tests now start authenticated, with no login loop in every test. Keep `.auth/` out of version control, since the file holds live session cookies.
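
If the saved state should also be reused across runs, a setup script could first check whether the stored cookies are still valid. The sketch below is an assumption-laden illustration: `hasFreshCookie` and both interfaces are hypothetical, and it assumes the saved JSON has a `cookies` array whose `expires` field is Unix time in seconds, with `-1` marking session cookies.

```typescript
// Minimal subset of the saved storage-state JSON: only the fields used below.
interface StoredCookie {
  name: string;
  expires: number; // assumed: Unix seconds, -1 for session cookies
}

interface StorageStateFile {
  cookies: StoredCookie[];
}

// Hypothetical helper: true when at least one persistent cookie has not expired.
// Session cookies (expires === -1) carry no expiry time and are ignored,
// so a state file holding only session cookies is treated as stale.
function hasFreshCookie(state: StorageStateFile, nowSeconds: number): boolean {
  return state.cookies.some(cookie => cookie.expires !== -1 && cookie.expires > nowSeconds);
}
```

Treating session-only state as stale errs on the side of logging in again, which costs time but avoids starting a suite with a dead session.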
+
+ ## Common pitfalls (cross-reference to anti-patterns)
+
+ | Pitfall | Anti-pattern | Fix |
+ |---------|--------------|-----|
+ | `await page.click('.btn-primary')` | A1 (implementation selectors) | `getByRole('button', { name: ... })` |
+ | `await page.waitForTimeout(3000)` | A7 (static sleeps) | Wait on an observable condition |
+ | `expect(button).toBeTruthy()` | A5 (vague existence) | Assert the specific state you need, e.g. `toBeVisible()` |
+ | Tests pass solo, fail in suite | A8 (order dependency) | Setup in `beforeEach`, not `beforeAll` |
+ | `await page.goto('https://www.google.com')` | A20 (third-party site) | Mock the third party or skip the test |
+
+ ## Limitations
+
+ - Playwright cannot test native mobile apps; use a native E2E tool such as Detox or Appium (React Native Testing Library covers component-level tests only).
+ - Some authentication flows (Google Sign-In, MFA hardware keys) cannot be reliably automated; use test-mode bypasses with a dedicated test account.
+ - Visual regression tests are sensitive to font rendering across OSes; generate and compare snapshots on a single, pinned CI runner OS.
+
+ ## Cross-skill integration
+
+ When running these recipes, the doctrine in this skill applies:
+ - Apply core rules (especially Rule 2: lowest layer first; many E2E tests should be integration tests instead).
+ - Run the 7 AI Agent Gates if a coding agent is producing this test.
+ - Check for Anti-Patterns A1, A5, A7, and A20 in the diff.
+ - For browser security scenarios (auth bypass, XSS, CSRF), see `security-boundary.md`.
+ - For picking which workflow (UI / network / perf) applies, see `three-workflow-patterns.md`.
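
The diff check for A1, A5, A7, and A20 can be approximated mechanically. The sketch below is only a rough heuristic, not a linter: `scanAddedLines`, `pitfallRules`, and every regular expression in it are hypothetical, and the A20 rule simply flags any absolute URL that is not localhost.

```typescript
interface PitfallRule {
  id: string;
  label: string;
  pattern: RegExp;
}

interface PitfallHit {
  id: string;
  label: string;
  line: string;
}

// Hypothetical rules: coarse regexes that mirror the pitfalls table above.
const pitfallRules: PitfallRule[] = [
  { id: 'A1', label: 'implementation selector', pattern: /page\.(click|fill|locator)\(\s*['"`][.#]/ },
  { id: 'A5', label: 'vague existence assertion', pattern: /\.toBeTruthy\(\)/ },
  { id: 'A7', label: 'static sleep', pattern: /waitForTimeout\(/ },
  { id: 'A20', label: 'third-party site', pattern: /page\.goto\(\s*['"`]https?:\/\/(?!localhost|127\.0\.0\.1)/ },
];

// Scans a unified diff and reports rule matches on added lines only.
function scanAddedLines(diff: string): PitfallHit[] {
  const hits: PitfallHit[] = [];
  for (const raw of diff.split('\n')) {
    // Skip removed/context lines and the "+++ b/file" header.
    if (!raw.startsWith('+') || raw.startsWith('+++')) continue;
    const line = raw.slice(1).trim();
    for (const rule of pitfallRules) {
      if (rule.pattern.test(line)) hits.push({ id: rule.id, label: rule.label, line });
    }
  }
  return hits;
}
```

A scan like this catches only the literal shapes shown in the table; it would miss selectors stored in variables, and it would wrongly flag a legitimate staging URL, so it works best as a prompt for review rather than a gate.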
@@ -1,6 +1,6 @@
  # Security boundary — every browser is hostile
 
- > Adapted from [`addyosmani/agent-skills/browser-devtools`](https://github.com/addyosmani/agent-skills/tree/main/browser-devtools) (MIT). Adopts the security-boundary principle to inform how webapp-testing scenarios should validate trust boundaries.
+ > Adapted from [`addyosmani/agent-skills/browser-devtools`](https://github.com/addyosmani/agent-skills/tree/main/browser-devtools) (MIT). Adopts the security-boundary principle to inform how browser-based testing scenarios validate trust boundaries.
 
  The browser is not a secure environment. Anything that runs there — JS, CSS, HTML, devtools — is under the user's control. When you write a webapp test, you're not just verifying functionality; you're often verifying that this assumption holds.