@seanyao/roll 0.5.0

Files changed (50)
  1. package/README.md +201 -0
  2. package/bin/roll +1375 -0
  3. package/conventions/config.yaml +15 -0
  4. package/conventions/global/.cursor-rules +31 -0
  5. package/conventions/global/AGENTS.md +100 -0
  6. package/conventions/global/CLAUDE.md +32 -0
  7. package/conventions/global/GEMINI.md +28 -0
  8. package/conventions/templates/backend-service/.cursor-rules +17 -0
  9. package/conventions/templates/backend-service/AGENTS.md +88 -0
  10. package/conventions/templates/backend-service/CLAUDE.md +18 -0
  11. package/conventions/templates/backend-service/GEMINI.md +16 -0
  12. package/conventions/templates/cli/.cursor-rules +17 -0
  13. package/conventions/templates/cli/AGENTS.md +66 -0
  14. package/conventions/templates/cli/CLAUDE.md +18 -0
  15. package/conventions/templates/cli/GEMINI.md +16 -0
  16. package/conventions/templates/frontend-only/.cursor-rules +16 -0
  17. package/conventions/templates/frontend-only/AGENTS.md +71 -0
  18. package/conventions/templates/frontend-only/CLAUDE.md +16 -0
  19. package/conventions/templates/frontend-only/GEMINI.md +14 -0
  20. package/conventions/templates/fullstack/.cursor-rules +17 -0
  21. package/conventions/templates/fullstack/AGENTS.md +87 -0
  22. package/conventions/templates/fullstack/CLAUDE.md +17 -0
  23. package/conventions/templates/fullstack/GEMINI.md +15 -0
  24. package/package.json +33 -0
  25. package/skills/roll-.changelog/SKILL.md +79 -0
  26. package/skills/roll-.clarify/SKILL.md +59 -0
  27. package/skills/roll-.echo/SKILL.md +113 -0
  28. package/skills/roll-.qa/SKILL.md +204 -0
  29. package/skills/roll-.review/SKILL.md +105 -0
  30. package/skills/roll-build/SKILL.md +559 -0
  31. package/skills/roll-debug/SKILL.md +428 -0
  32. package/skills/roll-design/ENGINEERING_CHECKLIST.md +256 -0
  33. package/skills/roll-design/SKILL.md +276 -0
  34. package/skills/roll-fix/SKILL.md +442 -0
  35. package/skills/roll-jot/SKILL.md +50 -0
  36. package/skills/roll-research/SKILL.md +307 -0
  37. package/skills/roll-research/references/schema.json +162 -0
  38. package/skills/roll-research/scripts/md_to_pdf.py +289 -0
  39. package/skills/roll-sentinel/SKILL.md +355 -0
  40. package/skills/roll-spar/SKILL.md +287 -0
  41. package/template/.env.example +47 -0
  42. package/template/.github/workflows/ci.yml +32 -0
  43. package/template/.github/workflows/sentinel.yml +26 -0
  44. package/template/AGENTS.md +80 -0
  45. package/template/BACKLOG.md +42 -0
  46. package/template/package.json +43 -0
  47. package/tools/roll-fetch/SKILL.md +182 -0
  48. package/tools/roll-fetch/package.json +15 -0
  49. package/tools/roll-fetch/smart-web-fetch.js +558 -0
  50. package/tools/roll-probe/SKILL.md +84 -0
package/conventions/templates/fullstack/AGENTS.md ADDED
@@ -0,0 +1,87 @@
1
+ # Project Conventions — Fullstack Web
2
+
3
+ > Project-type-specific conventions — reference material for skills.
4
+ > **Note: Reference Template** — used by skills to infer project conventions. Not selected by users.
5
+
6
+ ## Frontend
7
+
8
+ - Stack: React 18+ / TypeScript / Vite / Tailwind CSS / shadcn/ui / Lucide React
9
+ - Use shadcn/ui components first. Custom components only when shadcn doesn't cover it.
10
+ - `src/components/ui/` is shadcn-generated — never edit manually.
11
+ - `src/components/[feature]/` for custom feature components.
12
+ - Tailwind utility classes only. No inline styles, no CSS modules.
13
+ - Organize by domain: `src/domains/{domain}/components/`, `hooks/`, `services/`, `types.ts`
14
+ - Shared utilities in `src/shared/` (api/, types/, utils/, hooks/).
15
+
16
+ ## Backend
17
+
18
+ - RESTful API conventions. Consistent URL structure: `/api/{resource}/{id}`.
19
+ - Structured error responses: `{ error: string, code: string, details?: object }`.
20
+ - Environment-based config via `.env`. Never hardcode secrets.
21
+ - Folder structure:
22
+ - `src/routes/` or `api/routes/` — route handlers
23
+ - `src/services/` or `api/services/` — business logic
24
+ - `src/models/` or `api/models/` — data models and schemas
25
+ - Health check endpoint: `GET /api/health`
26
+ - Authentication: JWT in httpOnly cookies.
27
+
28
+ ## Architecture
29
+
30
+ - **Domain Driven Design**: organize code by business domain, not technical layer.
31
+ - **Clean Architecture**: UI → Application (hooks) → Domain (services) → Infrastructure (API/DB).
32
+ - **Decoupling**: UI renders only, logic lives in hooks. API calls wrapped in services.
33
+ - **Data Schema First**: define types/schemas before writing business logic.
34
+ - **Frontend-Backend Contract**: API changes must sync `shared/types/`. Errors use unified format.
35
+
36
+ ## Project Structure
37
+
38
+ ```
39
+ src/
40
+ ├── components/ui/ # shadcn/ui (generated, don't edit)
41
+ ├── domains/ # DDD by business domain
42
+ │ └── {domain}/
43
+ │ ├── components/ # domain-specific UI
44
+ │ ├── hooks/ # domain logic
45
+ │ ├── services/ # API calls
46
+ │ └── types.ts # domain types
47
+ ├── shared/
48
+ │ ├── api/ # HTTP client, interceptors
49
+ │ ├── types/ # shared type definitions
50
+ │ └── utils/ # utility functions
51
+ ├── App.tsx
52
+ └── main.tsx
53
+
54
+ api/
55
+ ├── routes/ # RESTful route handlers
56
+ ├── services/ # business logic
57
+ ├── models/ # data models
58
+ └── types.ts # API contract types
59
+
60
+ schema/ # data contract definitions
61
+ tests/
62
+ ├── unit/ # Vitest
63
+ ├── e2e/ # Playwright
64
+ └── regression/ # Sentinel regression
65
+ ```
66
+
67
+ ## Development Discipline
68
+
69
+ - **TCR mandatory**: All code changes follow Test → Green = Commit / Red = Revert. No WIP commits.
70
+ - **Action granularity**: Each Action independently deployable, completable in 2–5 min. No placeholders (no TBD/TODO/pending).
71
+ - **Verification Gate**: Before marking done, provide fresh evidence (test output, screenshot, curl). "I confirmed it works" is not evidence.
72
+ - **Complete delivery**: push to GitHub + CI passes + deployed online. Local-only done is not done.
73
+
74
+ ## Testing Requirements
75
+
76
+ - All business logic must have unit tests (coverage >80%).
77
+ - All API endpoints must have integration tests — no DB mocks, use real database.
78
+ - Critical user flows must have E2E tests (Playwright).
79
+ - New architecture introductions (State/Cache/EventBus) must have data flow integration tests.
80
+ - Sentinel will periodically regression-test completed Stories.
81
+
82
+ ## Workspace Structure
83
+
84
+ - `BACKLOG.md` = index table, one-line summary per story only.
85
+ - `docs/features/<feature>.md` = US details (AC, Files, Dependencies).
86
+ - `docs/features/<feature>-plan.md` = architecture design doc (optional).
87
+ - Never write project docs to `~/.kimi/` or any global config directory.
package/conventions/templates/fullstack/CLAUDE.md ADDED
@@ -0,0 +1,17 @@
1
+ # Project Preferences — Fullstack Web (Claude Code)
2
+
3
+ > Extends global CLAUDE.md + project AGENTS.md.
4
+
5
+ ## Stack
6
+
7
+ - Frontend: React + shadcn/ui + Tailwind CSS + Vite
8
+ - Backend: Node.js API (Express/Hono/Fastify)
9
+ - Testing: Vitest (unit) + Playwright (E2E)
10
+ - Deploy: Vercel (frontend) + Railway/Fly.io (backend)
11
+
12
+ ## Claude Code Notes
13
+
14
+ - Use `$roll-design` to plan features that span frontend and backend.
15
+ - When modifying API contracts, update both `api/types.ts` and `src/shared/types/` in the same commit.
16
+ - Use worktree isolation for parallel frontend/backend Actions in `$roll-story-build`.
17
+ - Run `npm run build` to verify both frontend and backend compile before pushing.
package/conventions/templates/fullstack/GEMINI.md ADDED
@@ -0,0 +1,15 @@
1
+ # Project Preferences — Fullstack Web (Gemini CLI)
2
+
3
+ > Extends global GEMINI.md + project AGENTS.md.
4
+
5
+ ## Stack
6
+
7
+ - Frontend: React + shadcn/ui + Tailwind CSS + Vite
8
+ - Backend: Node.js API (Express/Hono/Fastify)
9
+ - Testing: Vitest (unit) + Playwright (E2E)
10
+
11
+ ## Gemini Notes
12
+
13
+ - When modifying API contracts, update both `api/types.ts` and `src/shared/types/` in the same commit.
14
+ - Run `npm run build` to verify both frontend and backend compile before pushing.
15
+ - Follow the project AGENTS.md for architecture constraints and Roll workflow.
package/package.json ADDED
@@ -0,0 +1,33 @@
1
+ {
2
+ "name": "@seanyao/roll",
3
+ "version": "0.5.0",
4
+ "description": "Roll — Roll out features with AI agents",
5
+ "scripts": {
6
+ "test": "find tests/unit tests/integration -name '*.bats' | sort | xargs ./tests/helpers/bats-core/bin/bats"
7
+ },
8
+ "keywords": [
9
+ "ai",
10
+ "agent",
11
+ "development-workflow",
12
+ "skills",
13
+ "agent-first"
14
+ ],
15
+ "homepage": "https://github.com/seanyao/roll",
16
+ "repository": {
17
+ "type": "git",
18
+ "url": "https://github.com/seanyao/roll.git"
19
+ },
20
+ "bin": {
21
+ "roll": "./bin/roll"
22
+ },
23
+ "license": "MIT",
24
+ "author": "Sean Yao <seanyao@gmail.com>",
25
+ "files": [
26
+ "bin/",
27
+ "conventions/",
28
+ "skills/",
29
+ "tools/",
30
+ "template/",
31
+ "README.md"
32
+ ]
33
+ }
package/skills/roll-.changelog/SKILL.md ADDED
@@ -0,0 +1,79 @@
1
+ ---
2
+ hidden: true
3
+ name: roll-.changelog
4
+ description: After build completion, extracts completed Stories from BACKLOG.md to generate CHANGELOG.md. Auto-triggered after successful deploy, keeping the external changelog in sync with the internal backlog.
5
+ ---
6
+
7
+ # WK Generate Changelog
8
+
9
+ After successful Build & Deploy, extracts completed Stories from BACKLOG.md to generate a user-friendly `CHANGELOG.md`.
10
+
11
+ ## When Triggered
12
+
13
+ - **Auto-triggered**: After successful deploy of `$roll-story`, `$roll-fix`, or `$roll-fly`
14
+ - **Manual trigger**: When user requests "update changelog" or "generate release notes"
15
+
16
+ ## Workflow
17
+
18
+ ### 1. Read BACKLOG.md
19
+
20
+ ```
21
+ Read BACKLOG.md from the project root directory.
22
+ Extract Stories with status ✅ Completed / Done.
23
+ ```
24
+
25
+ ### 2. Filter for External Content
26
+
27
+ **Remove internal information:**
28
+ - Progress tables, completion percentages
29
+ - "As a / I can / So that" format
30
+ - Detailed AC checklists
31
+ - Technical debt, internal file paths
32
+ - Test case counts, architecture diagrams
33
+
34
+ **Keep user-facing value:**
35
+ - New features (one-sentence description)
36
+ - Bug fixes (user-visible impact)
37
+ - UX improvements (layout, interaction enhancements)
38
+ - Performance/reliability improvements
39
+
40
+ ### 3. Version Number Format
41
+
42
+ ```
43
+ YYYY.MM.DD
44
+ YYYY.MM.DD-1 (multiple releases on the same day)
45
+ YYYY.MM.DD-2
46
+ ```
47
+
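One possible reading of the version scheme above, sketched as a small function: the first release of a day gets the bare date, and later releases on the same day get `-1`, `-2`, and so on. The name `nextVersion` and the idea of passing in the list of already-used versions are assumptions for illustration; the skill itself does not define such a helper.

```typescript
// Hypothetical helper: picks the next version label for `date`,
// given the labels already present in CHANGELOG.md.
function nextVersion(existing: string[], date: Date): string {
  const pad = (n: number): string => ("0" + n).slice(-2);
  // Uses the local calendar date, like `date +%Y.%m.%d` would.
  const base = `${date.getFullYear()}.${pad(date.getMonth() + 1)}.${pad(date.getDate())}`;
  if (existing.indexOf(base) === -1) return base;
  // Same-day release: append the first free numeric suffix.
  let n = 1;
  while (existing.indexOf(`${base}-${n}`) !== -1) n += 1;
  return `${base}-${n}`;
}

// Example usage with made-up history.
const released = ["2026.04.01", "2026.04.03"];
const label = nextVersion(released, new Date(2026, 3, 3));
```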
48
+ ### 4. Generate CHANGELOG.md
49
+
50
+ ```markdown
51
+ # Changelog
52
+
53
+ ## 2026.04.03
54
+ - **Added**: <completed feature extracted from BACKLOG>
55
+ - **Fixed**: <resolved bug>
56
+ - **Improved**: <UX/performance optimization>
57
+
58
+ ## 2026.04.01
59
+ - ...
60
+ ```
61
+
62
+ **Ordering**: Most recent version first (reverse chronological)
63
+
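The generation and ordering rules above could be sketched as follows. The `Release` shape and the `renderChangelog` name are hypothetical; how Stories are actually extracted from BACKLOG.md is left out, since the skill describes that step only in prose.

```typescript
type Kind = "Added" | "Fixed" | "Improved";

// Hypothetical in-memory form of one changelog section.
interface Release {
  version: string; // "YYYY.MM.DD" or "YYYY.MM.DD-N"
  entries: { kind: Kind; text: string }[];
}

// Splits "YYYY.MM.DD-N" into its date part and numeric suffix (0 when absent).
function versionKey(version: string): [string, number] {
  const parts = version.split("-");
  return [parts[0], parts.length > 1 ? Number(parts[1]) : 0];
}

// Renders sections newest first: later dates first, then higher same-day suffixes.
function renderChangelog(releases: Release[]): string {
  const sorted = releases.slice().sort((a, b) => {
    const [dateA, nA] = versionKey(a.version);
    const [dateB, nB] = versionKey(b.version);
    // Zero-padded dates compare correctly as plain strings.
    if (dateA !== dateB) return dateA < dateB ? 1 : -1;
    return nB - nA;
  });
  const sections = sorted.map((r) =>
    [`## ${r.version}`]
      .concat(r.entries.map((e) => `- **${e.kind}**: ${e.text}`))
      .join("\n"),
  );
  return ["# Changelog"].concat(sections).join("\n\n") + "\n";
}
```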
64
+ ### 5. Commit Update
65
+
66
+ ```bash
67
+ git add CHANGELOG.md
68
+ git commit -m "docs: update changelog for release $(date +%Y.%m.%d)"
69
+ git push
70
+ ```
71
+
72
+ ## Integration
73
+
74
+ After successful deploy in `$roll-story` / `$roll-fix` / `$roll-fly`:
75
+
76
+ ```markdown
77
+ **Post-Deploy:**
78
+ - `$roll-.changelog` - Sync external changelog
79
+ ```
package/skills/roll-.clarify/SKILL.md ADDED
@@ -0,0 +1,59 @@
1
+ ---
2
+ hidden: true
3
+ name: roll-.clarify
4
+ description: |
5
+ Passive scope-clarification skill. Auto-triggers when roll-build receives vague or under-specified input in Fly mode. Summarizes intent and asks 3–5 targeted questions to establish boundaries before planning or coding.
6
+ This is a passive skill. Never announce "I'm using roll-.clarify." Just do it naturally: summarize, ask, wait.
7
+ ---
8
+
9
+ # roll-.clarify
10
+
11
+ > Understand first, build second.
12
+
13
+ ## Trigger
14
+
15
+ Auto-invoked by `roll-build` (Fly mode) when the user input is:
16
+ - A single vague sentence
17
+ - Missing clear boundaries (what/who/when/where)
18
+ - Contains ambiguous terms like "优化一下" ("optimize it a bit"), "改一下" ("change it a bit"), "加个东西" ("add something")
19
+ - Could be interpreted in multiple ways
20
+
21
+ **Do NOT activate when:**
22
+ - Intent is already clear and actionable
23
+ - User gives a specific command with a skill trigger (e.g. `$roll-jot ...`)
24
+ - User is answering a clarification question you just asked
25
+ - The task is simple enough that misinterpretation risk is negligible
26
+
27
+ ## Behavior
28
+
29
+ 1. **Summarize** the user's intent in 1–2 sentences.
30
+ 2. **Assess complexity** (small / medium / large).
31
+ 3. **Ask 3–5 targeted questions** to fill the biggest gaps. Focus on:
32
+ - Scope: what exactly is in / out?
33
+ - User: who is the actor?
34
+ - Data: what changes or persists?
35
+ - Edge cases: what could go wrong?
36
+ - Verification: how will we know it's done?
37
+
38
+ ## Output format
39
+
40
+ ```
41
+ 🎯 Clarified Intent: {1-2 sentences}
42
+
43
+ 📏 Complexity: {small|medium|large}
44
+
45
+ ❓ Open Questions:
46
+ 1. {question 1}
47
+ 2. {question 2}
48
+ 3. {question 3}
49
+ ...
50
+
51
+ ➡️ Please answer the questions above and I'll proceed to planning / building.
52
+ ```
53
+
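For illustration only, the output format above can be expressed as a small formatter that also enforces the 3–5 question range from the Behavior section. The function name `formatClarification` and its parameters are assumptions; the skill is carried out by the agent in conversation rather than by code.

```typescript
type Complexity = "small" | "medium" | "large";

// Hypothetical formatter for the clarification message described above.
function formatClarification(
  intent: string,
  complexity: Complexity,
  questions: string[],
): string {
  // The skill asks for 3 to 5 targeted questions.
  if (questions.length < 3 || questions.length > 5) {
    throw new Error(`Expected 3-5 questions, got ${questions.length}`);
  }
  const numbered = questions.map((q, i) => `${i + 1}. ${q}`);
  return [
    `🎯 Clarified Intent: ${intent}`,
    "",
    `📏 Complexity: ${complexity}`,
    "",
    "❓ Open Questions:",
  ]
    .concat(numbered)
    .concat([
      "",
      "➡️ Please answer the questions above and I'll proceed to planning / building.",
    ])
    .join("\n");
}
```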
54
+ ## Rules
55
+
56
+ - Do **not** start coding or planning until the user replies.
57
+ - Questions must be concrete and answerable in one sentence each.
58
+ - If the input is already clear enough, silently return and let `roll-build` continue.
59
+ - Never announce "I'm using roll-.clarify."
package/skills/roll-.echo/SKILL.md ADDED
@@ -0,0 +1,113 @@
1
+ ---
2
+ hidden: true
3
+ name: roll-.echo
4
+ description: |
5
+ Passive intent clarification skill. Automatically activates when user input is vague, rambling, contradictory, or unclear. Restates the user's intent in structured, concise form and confirms before proceeding. Does NOT activate when intent is already clear — in that case, just execute directly.
6
+ This is a passive skill. Never announce "I'm using roll-.echo." Just do it naturally: restate, confirm, proceed.
7
+ ---
8
+
9
+ # Echo
10
+
11
+ > Passive intent clarification — restate messy thoughts as clear intent, confirm, then act.
12
+
13
+ ## When to Activate
14
+
15
+ This skill fires **automatically** when the AI detects unclear intent. It should feel natural — not like a skill invocation, but like a thoughtful colleague saying "let me make sure I got that right."
16
+
17
+ **Activation signals** (any of these):
18
+ - Input is long and rambling (>3 sentences without a clear ask)
19
+ - Multiple ideas tangled together with no clear priority
20
+ - Contradictory statements ("I want X but also not-X")
21
+ - Hedging language: "maybe", "sort of", "I think", "not sure but", "something like"
22
+ - Vague scope: "make it better", "fix this area", "do something about"
23
+ - The intent could reasonably be interpreted in 2+ very different ways
24
+
25
+ **Do NOT activate when**:
26
+ - Intent is already clear and actionable ("add a login button to the header")
27
+ - User gives a specific command with a skill trigger ("$roll-story US-001")
28
+ - User is answering a question you asked (they're clarifying, not initiating)
29
+ - The task is simple enough that misinterpretation risk is negligible
30
+
31
+ **When in doubt**: If you're 80%+ confident you understand correctly, just execute. Echo is for the 50/50 situations where getting it wrong would waste real effort.
32
+
33
+ ## How to Echo
34
+
35
+ ### Step 1: Listen completely
36
+
37
+ Let the user finish. Don't interrupt a stream of consciousness — the clearest signal often comes at the end.
38
+
39
+ ### Step 2: Distill
40
+
41
+ Extract the core intent from the noise. Structure it as:
42
+
43
+ ```
44
+ What I'm hearing:
45
+ - **Goal**: {what they want to achieve — one sentence}
46
+ - **Scope**: {what's in / what's out}
47
+ - **Priority**: {if multiple things mentioned, which comes first}
48
+ - **Constraints**: {any specific requirements or limitations mentioned}
49
+ ```
50
+
51
+ Only include fields that are relevant. A simple clarification might just be:
52
+
53
+ ```
54
+ So you want: {one clear sentence restating the intent}
55
+ ```
56
+
57
+ ### Step 3: Surface ambiguity (if any)
58
+
59
+ If there are genuine decision points the user needs to resolve:
60
+
61
+ ```
62
+ One thing I want to confirm: {the specific ambiguity, with 2-3 concrete options}
63
+ ```
64
+
65
+ Don't manufacture ambiguity. If you understood fine, skip this.
66
+
67
+ ### Step 4: Confirm and proceed
68
+
69
+ End with a natural confirmation prompt, then act on the response:
70
+ - If confirmed → route to the appropriate skill or start working
71
+ - If corrected → update understanding, optionally echo again if the correction introduces new ambiguity
72
+
73
+ ## Style
74
+
75
+ - **Be brief.** The echo should be shorter than the original input, not longer.
76
+ - **Be concrete.** Replace vague language with specific interpretations. "Make it better" → "Improve the search response time" or "Redesign the search UI."
77
+ - **Be natural.** This is a conversation, not a form. Don't use the template rigidly.
78
+ - **No jargon.** Mirror the user's vocabulary, not technical framework terms.
79
+ - **No patronizing.** Don't echo things that are already obvious. The user will feel like you're not listening.
80
+
81
+ ## Anti-Patterns
82
+
83
+ - Echoing when intent is already clear (wastes the user's time)
84
+ - Echoing every single message (becomes annoying fast)
85
+ - Adding your own ideas to the echo (scope creep — echo is for understanding, not suggesting)
86
+ - Using the echo as a stall tactic when you should just do the work
87
+ - Turning a simple ask into a complex questionnaire
88
+
89
+ ## Examples
90
+
91
+ ### Good Echo (vague input)
92
+
93
+ **User**: "I've been thinking about the search, it's kind of slow and the results aren't great, maybe we need elasticsearch or something, and also the UI is weird on mobile, the filters don't work right, I don't know, just... it needs work"
94
+
95
+ **Echo**: "Got it — sounds like two separate issues: (1) search quality and performance (results aren't relevant + slow), and (2) mobile UI broken filters. Want me to tackle them as two separate stories, or is one more urgent?"
96
+
97
+ ### Good Echo (contradictory input)
98
+
99
+ **User**: "Make the dashboard simpler, but also add the analytics charts we discussed, and the export feature, oh and notifications"
100
+
101
+ **Echo**: "Those pull in opposite directions — simpler dashboard vs. adding 3 new features. Which do you mean: strip the current dashboard down and then selectively add these features? Or keep what's there and add the new stuff on top?"
102
+
103
+ ### No Echo Needed (clear input)
104
+
105
+ **User**: "Add a dark mode toggle to the settings page"
106
+
107
+ *→ Just do it. No echo needed.*
108
+
109
+ ### No Echo Needed (skill trigger)
110
+
111
+ **User**: "$roll-design search improvements"
112
+
113
+ *→ Route to roll-design. No echo needed.*
package/skills/roll-.qa/SKILL.md ADDED
@@ -0,0 +1,204 @@
1
+ ---
2
+ hidden: true
3
+ name: roll-.qa-cover
4
+ description: QA coverage reference for build skills. Defines test pyramid (unit/E2E/visual/smoke), coverage requirements, and CI gates. Ensures quality assurance across all testing layers.
5
+ ---
6
+
7
+ # QA Cover
8
+
9
+ This is a **reference skill** used by `roll-story`, `roll-fix`, and `roll-fly` for quality assurance and test coverage.
10
+
11
+ ## When to Apply
12
+
13
+ Any product with a user interface (Web, Desktop, Mobile) must follow these testing standards.
14
+
15
+ ## Required Testing Levels
16
+
17
+ ### 1. Unit Tests (Logic)
18
+ - **Tool**: Vitest / Jest
19
+ - **Coverage**: Business logic, utilities, hooks
20
+ - **Run**: `npm run test`
21
+
22
+ ### 2. E2E Tests (User Flows)
23
+ - **Tool**: **Playwright** (default)
24
+ - **Coverage**: Critical user paths, interactions
25
+ - **Run**: `npm run test:e2e`
26
+
27
+ ### 3. Visual Regression (UI Stability)
28
+ - **Tool**: Playwright screenshot testing
29
+ - **Coverage**: Key UI states
30
+ - **Run**: Part of E2E tests
31
+ - **Baseline**: Stored in `e2e/__snapshots__/`
32
+
33
+ ### 4. Smoke Tests (Post-deploy)
34
+ - **Tool**: Playwright
35
+ - **Coverage**: Core functionality on production
36
+ - **Run**: `npm run test:e2e:smoke`
37
+
38
+ ## Playwright Setup
39
+
40
+ ### Installation
41
+ ```bash
42
+ npm install -D @playwright/test
43
+ npx playwright install chromium
44
+ ```
45
+
46
+ ### Configuration (playwright.config.ts)
47
+ ```typescript
48
+ import { defineConfig, devices } from '@playwright/test';
49
+
50
+ export default defineConfig({
51
+ testDir: './e2e',
52
+ fullyParallel: true,
53
+ retries: process.env.CI ? 2 : 0,
54
+ reporter: 'html',
55
+ use: {
56
+ baseURL: process.env.PLAYWRIGHT_BASE_URL || 'http://localhost:5173',
57
+ trace: 'on-first-retry',
58
+ screenshot: 'only-on-failure',
59
+ },
60
+ projects: [
61
+ { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
62
+ ],
63
+ });
64
+ ```
65
+
66
+ ### Required Test Files
67
+
68
+ **e2e/smoke.spec.ts** (Deployment verification)
69
+ ```typescript
70
+ import { test, expect } from '@playwright/test';
71
+
72
+ test('app loads', async ({ page }) => {
73
+ await page.goto('/');
74
+ await expect(page.locator('#app')).toBeVisible();
75
+ });
76
+ ```
77
+
78
+ **e2e/interaction.spec.ts** (User flows)
79
+ ```typescript
80
+ import { test } from '@playwright/test';
+
+ test('user can complete core flow', async ({ page }) => {
81
+ await page.goto('/');
82
+ // Test critical user journey
83
+ });
84
+ ```
85
+
86
+ **e2e/visual.spec.ts** (Visual regression)
87
+ ```typescript
88
+ import { test, expect } from '@playwright/test';
+
+ test('homepage visual', async ({ page }) => {
89
+ await page.goto('/');
90
+ await expect(page).toHaveScreenshot('homepage.png');
91
+ });
92
+ ```
93
+
94
+ ### Package.json Scripts
95
+ ```json
96
+ {
97
+ "scripts": {
98
+ "test:e2e": "playwright test",
99
+ "test:e2e:ui": "playwright test --ui",
100
+ "test:e2e:smoke": "playwright test smoke.spec.ts",
101
+ "test:all": "npm run test && npm run test:e2e"
102
+ }
103
+ }
104
+ ```
105
+
106
+ ## Visual Regression Workflow
107
+
108
+ ### 1. Create Baseline (First Time)
109
+ ```bash
110
+ npx playwright test --update-snapshots
111
+ ```
112
+
113
+ ### 2. Commit Baseline
114
+ ```bash
115
+ git add e2e/__snapshots__/
116
+ git commit -m "chore: add visual regression baselines"
117
+ ```
118
+
119
+ ### 3. Subsequent Runs (Compare)
120
+ ```bash
121
+ npm run test:e2e
122
+ # Fails if screenshots differ beyond threshold
123
+ ```
124
+
125
+ ### 4. Update Baseline (Intentional UI Change)
126
+ ```bash
127
+ npx playwright test --update-snapshots
128
+ git add e2e/__snapshots__/
129
+ git commit -m "chore: update visual baseline for new design"
130
+ ```
131
+
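The workflow above commits baselines from `e2e/__snapshots__/`. By default, Playwright writes snapshots to a `<test file>-snapshots/` directory next to each spec, so a configuration along these lines is probably needed for that path to hold. This is a sketch, not the package's own config: the template string and the `maxDiffPixels` value are assumptions.

```typescript
import { defineConfig } from '@playwright/test';

export default defineConfig({
  testDir: './e2e',
  // Assumption: route all baselines into e2e/__snapshots__/ instead of the
  // default per-spec `-snapshots` directories.
  snapshotPathTemplate: '{testDir}/__snapshots__/{testFilePath}/{arg}{ext}',
  expect: {
    // Project-wide tolerance for toHaveScreenshot(); 100 is an illustrative value.
    toHaveScreenshot: { maxDiffPixels: 100 },
  },
});
```

Note that this template drops the project name and platform from the file name, so baselines would be shared across browsers and operating systems. That is only safe when a single project runs on a single platform, as in the Chromium-only configuration shown earlier.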
132
+ ## CI/CD Integration
133
+
134
+ ### Local Pre-push Checklist
135
+ - [ ] `npm run test` passes
136
+ - [ ] `npm run test:e2e` passes
137
+ - [ ] No unexpected visual regressions
138
+
139
+ ### Post-deploy Smoke Test
140
+ ```bash
141
+ # Against production URL
142
+ PLAYWRIGHT_BASE_URL=https://your-app.com npm run test:e2e:smoke
143
+ ```
144
+
145
+ ## Common Patterns
146
+
147
+ ### Testing Canvas/Game Rendering
148
+ ```typescript
149
+ test('game renders', async ({ page }) => {
150
+ await page.goto('/');
151
+ const canvas = page.locator('#gameCanvas');
152
+ await expect(canvas).toBeVisible();
153
+
154
+ // Visual regression for canvas
155
+ await expect(page).toHaveScreenshot('game-initial.png', {
156
+ maxDiffPixels: 100,
157
+ });
158
+ });
159
+ ```
160
+
161
+ ### Testing Responsive Layouts
162
+ ```typescript
163
+ test('responsive design', async ({ page }) => {
164
+ await page.setViewportSize({ width: 375, height: 667 });
165
+ await page.goto('/');
166
+ await expect(page.locator('.mobile-menu')).toBeVisible();
167
+ });
168
+ ```
169
+
170
+ ### Testing Voice/Audio Features
171
+ ```typescript
172
+ test('voice button toggles', async ({ page }) => {
173
+ await page.goto('/');
174
+ const btn = page.locator('#voiceBtn');
175
+ await btn.click();
176
+ await expect(btn).toHaveClass(/active/);
177
+ });
178
+ ```
179
+
180
+ ## Failure Handling
181
+
182
+ ### Flaky Tests
183
+ - Add `test.fixme()` to skip temporarily
184
+ - Increase `timeout` for slow operations
185
+ - Use `retries` in config for network-dependent tests
186
+
187
+ ### Visual Regression Failures
188
+ 1. Check if change is intentional
189
+ 2. If yes: `npx playwright test --update-snapshots`
190
+ 3. If no: fix the code
191
+
192
+ ### Missing Test Infrastructure
193
+ If project lacks Playwright setup:
194
+ 1. Install dependencies
195
+ 2. Create config
196
+ 3. Add basic smoke test
197
+ 4. Run to create baseline
198
+ 5. Commit as separate "test infrastructure" change
199
+
200
+ ## References
201
+
202
+ - [Playwright Docs](https://playwright.dev/)
203
+ - [Visual Regression Guide](https://playwright.dev/docs/test-snapshots)
204
+ - Example implementation: `seanyao/kids-game/e2e/`