agileflow 2.80.0 → 2.81.0

@@ -0,0 +1,300 @@
1
+ ---
2
+ name: configuration-visual-e2e
3
+ description: Configure Visual E2E testing infrastructure with Playwright and screenshot verification
4
+ tools: Read, Write, Edit, Bash, Glob, Grep
5
+ model: haiku
6
+ compact_context:
7
+   priority: high
8
+   preserve_rules:
9
+     - "Install Playwright with npx playwright install --with-deps chromium"
10
+     - "Create playwright.config.ts with webServer config for auto-starting dev server"
11
+     - "Create tests/e2e/ directory with example test that takes screenshots"
12
+     - "Create screenshots/ directory for visual verification workflow"
13
+     - "Add test:e2e script to package.json"
14
+     - "All screenshots must be visually reviewed and renamed with 'verified-' prefix"
15
+     - "Use TodoWrite to track all 8 setup steps"
16
+     - "Run example test after setup to verify it works"
17
+   state_fields:
18
+     - playwright_installed
19
+     - config_created
20
+     - example_test_created
21
+     - screenshots_dir_created
22
+ ---
23
+
24
+ # Configuration: Visual E2E Testing
25
+
26
+ Set up Visual E2E testing infrastructure with Playwright and screenshot verification workflow for reliable UI development.
27
+
28
+ ---
29
+
30
+ ## What This Does
31
+
32
+ Visual E2E testing catches issues that functional tests miss:
33
+
34
+ 1. **Playwright Setup** - Install test runner and chromium browser
35
+ 2. **Screenshot Capture** - E2E tests capture screenshots during test runs
36
+ 3. **Visual Verification** - Claude reviews screenshots before marking UI work complete
37
+ 4. **Auto-Start Dev Server** - webServer config auto-starts dev server for tests
38
+
39
+ ---
40
+
41
+ ## Configuration Steps
42
+
43
+ ### Step 1: Check Prerequisites
44
+
45
+ ```bash
46
+ # Verify package.json exists
47
+ ls package.json
48
+ ```
49
+
50
+ If no package.json, exit with: "This project needs a package.json. Run `npm init` first."
51
+
52
+ ### Step 2: Ask User to Proceed
53
+
54
+ ```xml
55
+ <invoke name="AskUserQuestion">
56
+ <parameter name="questions">[{
57
+ "question": "Set up Visual E2E testing with Playwright?",
58
+ "header": "Visual E2E",
59
+ "multiSelect": false,
60
+ "options": [
61
+ {"label": "Yes, install Playwright (Recommended)", "description": "~300MB for chromium browser, creates tests/e2e/ and screenshots/"},
62
+ {"label": "Skip", "description": "No Visual E2E setup"}
63
+ ]
64
+ }]</parameter>
65
+ </invoke>
66
+ ```
67
+
68
+ If user selects "Skip", exit with: "Visual E2E setup skipped. Run /agileflow:configure to set up later."
69
+
70
+ ### Step 3: Ask Dev Server Configuration
71
+
72
+ ```xml
73
+ <invoke name="AskUserQuestion">
74
+ <parameter name="questions">[{
75
+ "question": "What command starts your dev server?",
76
+ "header": "Dev Server",
77
+ "multiSelect": false,
78
+ "options": [
79
+ {"label": "npm run dev", "description": "Default Next.js/Vite command"},
80
+ {"label": "npm start", "description": "Create React App default"},
81
+ {"label": "yarn dev", "description": "Yarn package manager"}
82
+ ]
83
+ }]</parameter>
84
+ </invoke>
85
+ ```
86
+
87
+ ### Step 4: Install Playwright
88
+
89
+ ```bash
90
+ # Install Playwright test runner
91
+ npm install --save-dev @playwright/test
92
+
93
+ # Install chromium browser (smallest option, ~300MB)
94
+ npx playwright install --with-deps chromium
95
+ ```
96
+
97
+ ### Step 5: Create playwright.config.ts
98
+
99
+ Create `playwright.config.ts` in project root:
100
+
101
+ ```typescript
102
+ import { defineConfig, devices } from '@playwright/test';
103
+
104
+ export default defineConfig({
105
+   testDir: './tests/e2e',
106
+
107
+   // Run tests in parallel
108
+   fullyParallel: true,
109
+
110
+   // Fail the build on CI if you accidentally left test.only
111
+   forbidOnly: !!process.env.CI,
112
+
113
+   // Retry on CI only
114
+   retries: process.env.CI ? 2 : 0,
115
+
116
+   // Opt out of parallel tests on CI
117
+   workers: process.env.CI ? 1 : undefined,
118
+
119
+   // Reporter
120
+   reporter: 'html',
121
+
122
+   use: {
123
+     // Base URL for navigation
124
+     baseURL: 'http://localhost:3000',
125
+
126
+     // Capture a screenshot after every test
127
+     screenshot: 'on',
128
+
129
+     // Collect a trace when a failed test is retried for the first time
130
+     trace: 'on-first-retry',
131
+   },
132
+
133
+   // Configure webServer to auto-start dev server
134
+   webServer: {
135
+     command: 'npm run dev', // Replace with user's choice from Step 3
136
+     url: 'http://localhost:3000',
137
+     reuseExistingServer: !process.env.CI,
138
+     timeout: 120000,
139
+   },
140
+
141
+   projects: [
142
+     {
143
+       name: 'chromium',
144
+       use: { ...devices['Desktop Chrome'] },
145
+     },
146
+   ],
147
+ });
148
+ ```
149
+
150
+ ### Step 6: Create Directory Structure
151
+
152
+ ```bash
153
+ # Create tests/e2e directory
154
+ mkdir -p tests/e2e
155
+
156
+ # Create screenshots directory
157
+ mkdir -p screenshots
158
+ ```
159
+
160
+ ### Step 7: Create Example Test
161
+
162
+ Create `tests/e2e/visual-example.spec.ts`:
163
+
164
+ ```typescript
165
+ import { test, expect } from '@playwright/test';
166
+
167
+ test.describe('Visual Verification Examples', () => {
168
+   test('homepage loads correctly', async ({ page }) => {
169
+     await page.goto('/');
170
+
171
+     // Capture full-page screenshot for visual verification
172
+     await page.screenshot({
173
+       path: 'screenshots/homepage-full.png',
174
+       fullPage: true,
175
+     });
176
+
177
+     // Basic assertions
178
+     await expect(page).toHaveTitle(/./);
179
+   });
180
+
181
+   test('component renders correctly', async ({ page }) => {
182
+     await page.goto('/');
183
+
184
+     // Capture specific element screenshot
185
+     const header = page.locator('header').first();
186
+     if (await header.isVisible()) {
187
+       await header.screenshot({
188
+         path: 'screenshots/header-component.png',
189
+       });
190
+     }
191
+
192
+     // Verify element is visible
193
+     await expect(header).toBeVisible();
194
+   });
195
+ });
196
+ ```
197
+
198
+ ### Step 8: Add npm Scripts
199
+
200
+ Add to package.json scripts:
201
+
202
+ ```json
203
+ {
204
+   "scripts": {
205
+     "test:e2e": "playwright test",
206
+     "test:e2e:ui": "playwright test --ui",
207
+     "test:e2e:headed": "playwright test --headed"
208
+   }
209
+ }
210
+ ```
211
+
212
+ ### Step 9: Run Verification Test
213
+
214
+ ```bash
215
+ npm run test:e2e
216
+ ```
217
+
218
+ ### Step 10: Show Completion Summary
219
+
220
+ ```
221
+ Visual E2E Setup Complete
222
+
223
+ Installed:
224
+ - @playwright/test
225
+ - chromium browser
226
+
227
+ Created:
228
+ - playwright.config.ts (with webServer auto-start)
229
+ - tests/e2e/visual-example.spec.ts (example test)
230
+ - screenshots/ (for visual verification)
231
+
232
+ Added scripts to package.json:
233
+ - npm run test:e2e Run all e2e tests
234
+ - npm run test:e2e:ui Run with Playwright UI
235
+ - npm run test:e2e:headed Run with visible browser
236
+
237
+ Visual Verification Workflow:
238
+ 1. Run tests: npm run test:e2e
239
+ 2. Review screenshots in screenshots/
240
+ 3. Rename verified: mv file.png verified-file.png
241
+ 4. Verify all: node scripts/screenshot-verifier.js
242
+
243
+ Why Visual Mode?
244
+ Tests passing doesn't mean UI looks correct. A button can "work"
245
+ but be the wrong color, position, or missing entirely.
246
+ Visual verification catches these issues.
247
+ ```
248
+
249
+ ---
250
+
251
+ ## Visual Verification Workflow
252
+
253
+ After running tests:
254
+
255
+ 1. **Review screenshots**: Read each screenshot in screenshots/
256
+ 2. **Verify visually**: Check that UI looks correct
257
+ 3. **Rename verified**: `mv screenshots/homepage-full.png screenshots/verified-homepage-full.png`
258
+ 4. **Run verifier**: `node scripts/screenshot-verifier.js --path ./screenshots`
259
+
260
+ The `verified-` prefix gives the verifier a record that Claude looked at each screenshot before declaring completion.
261
+
262
+ ---
263
+
264
+ ## Integration with Ralph Loop
265
+
266
+ When using Visual Mode in Ralph Loop:
267
+
268
+ ```bash
269
+ # Initialize loop with Visual Mode
270
+ node scripts/ralph-loop.js --init --epic=EP-XXXX --visual
271
+
272
+ # Loop checks:
273
+ # 1. npm test passes
274
+ # 2. All screenshots have verified- prefix
275
+ # 3. Minimum 2 iterations completed
276
+ ```
277
+
278
+ Visual Mode prevents premature completion promises for UI work.
279
+
280
+ ---
281
+
282
+ ## Troubleshooting
283
+
284
+ **Tests fail with "No server running":**
285
+ - Ensure webServer command matches your dev server command
286
+ - Check the port number in baseURL matches your app
287
+
288
+ **Screenshots directory empty:**
289
+ - Tests must include `await page.screenshot({path: 'screenshots/...'})` calls
290
+ - Check test output for errors
291
+
292
+ **Browser not installed:**
293
+ - Run `npx playwright install --with-deps chromium`
294
+
295
+ ---
296
+
297
+ ## Related
298
+
299
+ - Playwright docs: https://playwright.dev/docs/intro
300
+ - webServer config: https://playwright.dev/docs/test-webserver
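The visual gate described above reduces to a file-name check: every PNG in `screenshots/` must carry the `verified-` prefix. The package's `scripts/screenshot-verifier.js` is not part of this diff, so the following TypeScript is only a minimal sketch of that rule under stated assumptions. `checkScreenshots` and `VerificationReport` are hypothetical names, and treating an empty directory as unverified is an assumption rather than documented behavior.

```typescript
// Hypothetical helper; NOT the actual scripts/screenshot-verifier.js.
const VERIFIED_PREFIX = "verified-";

interface VerificationReport {
  verified: string[];
  unverified: string[];
  allVerified: boolean;
}

// Classify screenshot file names by whether they carry the verified- prefix.
function checkScreenshots(fileNames: string[]): VerificationReport {
  // Only PNG files count as screenshots; other files are ignored.
  const pngs = fileNames.filter((name) => name.toLowerCase().endsWith(".png"));
  const verified = pngs.filter((name) => name.startsWith(VERIFIED_PREFIX));
  const unverified = pngs.filter((name) => !name.startsWith(VERIFIED_PREFIX));
  // Assumption: with no screenshots at all, nothing has been reviewed yet.
  const allVerified = pngs.length > 0 && unverified.length === 0;
  return { verified, unverified, allVerified };
}
```

The function takes a plain array of names so the rule can be exercised without touching the disk; in a real script the names might come from a directory listing of `screenshots/`.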
@@ -90,6 +90,7 @@ RULE #4: SYNTHESIS REQUIREMENTS
90
90
  3. Collect ALL results before synthesizing
91
91
  4. Always flag conflicts in final answer
92
92
  5. Provide recommendation with rationale
93
+ 6. 🧪 EXPERIMENTAL: For quality gates (coverage ≥ X%, tests pass), use nested loops - see "NESTED LOOP MODE" section
93
94
 
94
95
  <!-- COMPACT_SUMMARY_END -->
95
96
 
@@ -326,3 +327,168 @@ These are independent — deploying in parallel.
326
327
 
327
328
  Proceed with integration?
328
329
  ```
330
+
331
+ ---
332
+
333
+ ## NESTED LOOP MODE (Experimental)
334
+
335
+ When agents need to iterate until quality gates pass, use **nested loops**. Each agent runs its own isolated loop with quality verification.
336
+
337
+ ### When to Use
338
+
339
+ | Scenario | Use Nested Loops? |
340
+ |----------|-------------------|
341
+ | Simple implementation | No - single expert spawn |
342
+ | Need coverage threshold | Yes - agent loops until coverage met |
343
+ | Need visual verification | Yes - agent loops until screenshots verified |
344
+ | Complex multi-gate feature | Yes - each domain gets its own loop |
345
+
346
+ ### How It Works
347
+
348
+ ```
349
+ ┌─────────────────────────────────────────────────────────────┐
350
+ │ ORCHESTRATOR │
351
+ │ │
352
+ │ ┌──────────────────┐ ┌──────────────────┐ │
353
+ │ │ API Agent │ │ UI Agent │ (parallel) │
354
+ │ │ Loop: coverage │ │ Loop: visual │ │
355
+ │ │ Max: 5 iter │ │ Max: 5 iter │ ← ISOLATED │
356
+ │ └──────────────────┘ └──────────────────┘ │
357
+ │ ↓ ↓ │
358
+ │ TaskOutput TaskOutput │
359
+ │ ↓ ↓ │
360
+ │ ┌──────────────────────────────────────────────────────┐ │
361
+ │ │ SYNTHESIS + VERIFICATION │ │
362
+ │ └──────────────────────────────────────────────────────┘ │
363
+ └─────────────────────────────────────────────────────────────┘
364
+ ```
365
+
366
+ ### Spawning with Agent Loops
367
+
368
+ **Step 1: Generate loop ID and include in prompt**
369
+
370
+ ```
371
+ Task(
372
+ description: "API with coverage loop",
373
+ prompt: `Implement /api/profile endpoint.
374
+
375
+ ## AGENT LOOP ACTIVE
376
+
377
+ You have a quality gate to satisfy:
378
+ - Gate: coverage >= 80%
379
+ - Max iterations: 5
380
+ - Loop ID: abc12345
381
+
382
+ ## Workflow
383
+
384
+ 1. Implement the feature
385
+ 2. Run the gate check:
386
+ node .agileflow/scripts/agent-loop.js --check --loop-id=abc12345
387
+ 3. If check returns exit code 2 (running), iterate and improve
388
+ 4. If check returns exit code 0 (passed), you're done
389
+ 5. If check returns exit code 1 (failed), report the failure
390
+
391
+ Continue iterating until the gate passes or max iterations reached.`,
392
+ subagent_type: "agileflow-api",
393
+ run_in_background: true
394
+ )
395
+ ```
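The exit-code contract in the prompt above (0 passed, 1 failed, 2 still running) can be sketched as a small driver loop. This is an illustration, not the behavior of `agent-loop.js` itself: `runGateLoop`, `check`, and `improve` are hypothetical names, `check` stands in for running the `--check` command and returning its exit code, and the local iteration cap is an assumption layered on top of whatever the script enforces.

```typescript
type LoopStatus = "passed" | "failed" | "max_iterations";

interface LoopOutcome {
  status: LoopStatus;
  iterations: number;
}

// Run the gate check repeatedly, improving the work between checks.
function runGateLoop(
  check: () => number,
  improve: () => void,
  maxIterations = 5,
): LoopOutcome {
  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    const exitCode = check();
    if (exitCode === 0) return { status: "passed", iterations: iteration };
    // Assumption: any code other than 2 (still running) counts as a failure.
    if (exitCode !== 2) return { status: "failed", iterations: iteration };
    improve();
  }
  return { status: "max_iterations", iterations: maxIterations };
}
```

Passing the check and the improvement step as callbacks keeps the control flow separate from how the command is actually executed.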
396
+
397
+ **Step 2: Initialize the loop before spawning**
398
+
399
+ Before spawning the agent, the orchestrator should record that agent loops are in use. The orchestrator does not initialize the loop itself; the spawned agent does so with:
400
+
401
+ ```bash
402
+ node .agileflow/scripts/agent-loop.js --init --gate=coverage --threshold=80 --max=5 --agent=agileflow-api --loop-id=abc12345
403
+ ```
404
+
405
+ ### Available Quality Gates
406
+
407
+ | Gate | Flag | Description |
408
+ |------|------|-------------|
409
+ | `tests` | `--gate=tests` | Run test command, pass on exit 0 |
410
+ | `coverage` | `--gate=coverage --threshold=80` | Run coverage, pass when >= threshold |
411
+ | `visual` | `--gate=visual` | Check screenshots have verified- prefix |
412
+ | `lint` | `--gate=lint` | Run lint command, pass on exit 0 |
413
+ | `types` | `--gate=types` | Run tsc --noEmit, pass on exit 0 |
414
+
415
+ ### Monitoring Progress
416
+
417
+ Read the event bus for loop status:
418
+
419
+ ```bash
420
+ # Events emitted to: docs/09-agents/bus/log.jsonl
421
+
422
+ {"type":"agent_loop","event":"init","loop_id":"abc12345","agent":"agileflow-api","gate":"coverage","threshold":80}
423
+ {"type":"agent_loop","event":"iteration","loop_id":"abc12345","iter":1,"value":65,"passed":false}
424
+ {"type":"agent_loop","event":"iteration","loop_id":"abc12345","iter":2,"value":72,"passed":false}
425
+ {"type":"agent_loop","event":"passed","loop_id":"abc12345","final_value":82,"iterations":3}
426
+ ```
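The sample events above are one JSON object per line, so progress can be recovered by folding the log into a per-loop summary. The sketch below handles only the three event kinds shown in the sample (`init`, `iteration`, `passed`); `summarizeLoops` and `LoopSummary` are hypothetical names, and any other event shapes the real bus may emit are not covered.

```typescript
interface LoopSummary {
  loopId: string;
  gate?: string;
  values: number[];
  iterations: number;
  status: "running" | "passed";
}

// Fold JSONL bus events into one summary per loop_id.
function summarizeLoops(jsonl: string): Map<string, LoopSummary> {
  const summaries = new Map<string, LoopSummary>();
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    let event: any;
    try {
      event = JSON.parse(trimmed);
    } catch {
      continue; // skip malformed lines instead of aborting the whole read
    }
    if (event?.type !== "agent_loop" || typeof event.loop_id !== "string") continue;
    const summary: LoopSummary = summaries.get(event.loop_id) ?? {
      loopId: event.loop_id,
      values: [],
      iterations: 0,
      status: "running",
    };
    if (event.event === "init") {
      summary.gate = event.gate;
    } else if (event.event === "iteration") {
      summary.values.push(event.value);
      summary.iterations = event.iter;
    } else if (event.event === "passed") {
      summary.values.push(event.final_value);
      summary.iterations = event.iterations;
      summary.status = "passed";
    }
    summaries.set(event.loop_id, summary);
  }
  return summaries;
}
```

Fed the four sample lines, the summary for `abc12345` holds the values 65, 72, 82 over 3 iterations with status `passed`.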
427
+
428
+ ### Safety Limits
429
+
430
+ | Limit | Value | Enforced By |
431
+ |-------|-------|-------------|
432
+ | Max iterations per agent | 5 | agent-loop.js |
433
+ | Max concurrent loops | 3 | agent-loop.js |
434
+ | Timeout per loop | 10 min | agent-loop.js |
435
+ | Regression abort | 2 consecutive | agent-loop.js |
436
+ | Stall abort | 5 min no progress | agent-loop.js |
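Two of the limits in the table above, the iteration cap and the regression abort, depend only on the sequence of measured values, so they can be sketched as a pure function. This is not the logic of `agent-loop.js`: `evaluateLoop` and `Verdict` are hypothetical names, and reading "2 consecutive" as two successive drops in the measured value is an assumption. The timeout and stall limits need wall-clock time and are left out.

```typescript
type Verdict = "passed" | "continue" | "abort_regression" | "abort_max_iterations";

// Decide what a loop should do next from the values measured so far.
function evaluateLoop(
  values: number[],
  threshold: number,
  maxIterations = 5,
  maxRegressions = 2,
): Verdict {
  if (values.length === 0) return "continue";
  const latest = values[values.length - 1];
  if (latest >= threshold) return "passed";

  // Count how many times in a row the value dropped, ending at the latest one.
  let regressions = 0;
  for (let i = values.length - 1; i > 0 && values[i] < values[i - 1]; i--) {
    regressions++;
  }
  if (regressions >= maxRegressions) return "abort_regression";
  if (values.length >= maxIterations) return "abort_max_iterations";
  return "continue";
}
```

Checking the threshold first means a loop that reaches the gate on its final allowed iteration still counts as passed.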
437
+
438
+ ### Example: Full Feature with Quality Gates
439
+
440
+ ```
441
+ Request: "Implement user profile with API at 80% coverage and UI with visual verification"
442
+
443
+ Parallel spawn:
444
+ - agileflow-api with coverage loop (threshold: 80%)
445
+ - agileflow-ui with visual loop
446
+
447
+ ## Agent Loop Status
448
+
449
+ ### API Expert (agileflow-api)
450
+ - Gate: coverage >= 80%
451
+ - Iterations: 3
452
+ - Progress: 65% → 72% → 82% ✓
453
+ - Status: PASSED
454
+
455
+ ### UI Expert (agileflow-ui)
456
+ - Gate: visual (screenshots verified)
457
+ - Iterations: 2
458
+ - Progress: 0/3 → 3/3 verified ✓
459
+ - Status: PASSED
460
+
461
+ ## Synthesis
462
+
463
+ Both quality gates satisfied. Feature implementation complete.
464
+
465
+ Files created:
466
+ - src/routes/profile.ts (API)
467
+ - src/components/ProfilePage.tsx (UI)
468
+ - tests/profile.test.ts (coverage)
469
+ - screenshots/verified-profile-*.png (visual)
470
+ ```
471
+
472
+ ### Abort Handling
473
+
474
+ If an agent loop fails:
475
+
476
+ 1. **Max iterations reached**: Report which gate wasn't satisfied
477
+ 2. **Regression detected**: Note that quality went down twice
478
+ 3. **Stalled**: Note no progress for 5+ minutes
479
+ 4. **Timeout**: Note 10-minute limit exceeded
480
+
481
+ ```markdown
482
+ ## Agent Loop FAILED
483
+
484
+ ### API Expert (agileflow-api)
485
+ - Gate: coverage >= 80%
486
+ - Final: 72%
487
+ - Status: FAILED (max_iterations)
488
+ - Reason: Couldn't reach 80% coverage in 5 iterations
489
+
490
+ ### Recommendation
491
+ - Review uncovered code paths
492
+ - Consider if 80% is achievable
493
+ - May need to reduce threshold or add more test cases
494
+ ```
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  description: Interactive mentor for end-to-end feature implementation
3
- argument-hint: "[EPIC=<id>] [MODE=loop] [MAX=<iterations>] [VISUAL=true]"
3
+ argument-hint: "[EPIC=<id>] [MODE=loop] [MAX=<iterations>] [VISUAL=true] [COVERAGE=<percent>]"
4
4
  compact_context:
5
5
  priority: critical
6
6
  preserve_rules:
@@ -62,6 +62,7 @@ When invoked with `MODE=loop`, babysit runs autonomously through an epic's stori
62
62
  | `MODE` | Yes | Must be `loop` for autonomous mode |
63
63
  | `MAX` | No | Max iterations (default: 20) |
64
64
  | `VISUAL` | No | Enable Visual Mode for UI development (screenshot verification) |
65
+ | `COVERAGE` | No | Enable Coverage Mode - iterate until N% test coverage reached |
65
66
 
66
67
  ### To Start Loop Mode
67
68
 
@@ -73,6 +74,9 @@ node scripts/ralph-loop.js --init --epic=EP-0042 --max=20
73
74
 
74
75
  # With Visual Mode for UI development
75
76
  node scripts/ralph-loop.js --init --epic=EP-0042 --max=20 --visual
77
+
78
+ # With Coverage Mode - iterate until 80% coverage
79
+ node scripts/ralph-loop.js --init --epic=EP-0042 --max=20 --coverage=80
76
80
  ```
77
81
 
78
82
  Or manually write to session-state.json:
@@ -86,7 +90,43 @@ Or manually write to session-state.json:
86
90
  "iteration": 0,
87
91
  "max_iterations": 20,
88
92
  "visual_mode": false,
89
- "screenshots_verified": false
93
+ "screenshots_verified": false,
94
+ "coverage_mode": false,
95
+ "coverage_threshold": 80,
96
+ "coverage_baseline": 0,
97
+ "coverage_current": 0,
98
+ "coverage_verified": false
99
+ }
100
+ }
101
+ ```
102
+
103
+ ### Coverage Mode
104
+
105
+ When `COVERAGE=<percent>` is specified, the loop adds test coverage verification:
106
+
107
+ ```
108
+ /agileflow:babysit EPIC=EP-0042 MODE=loop COVERAGE=80
109
+ ```
110
+
111
+ **Coverage Mode behavior:**
112
+ 1. After tests pass, runs coverage check command
113
+ 2. Parses `coverage/coverage-summary.json` (Jest/NYC format)
114
+ 3. Compares line coverage to threshold
115
+ 4. Requires minimum 2 iterations before completion
116
+ 5. Story completes only when coverage ≥ threshold AND confirmed
117
+
118
+ **When to use Coverage Mode:**
119
+ - Test-driven epics where coverage matters
120
+ - "Write tests until X% coverage" goals
121
+ - Batch test generation overnight
122
+
123
+ **Configuration** (optional):
124
+ Add to `docs/00-meta/agileflow-metadata.json`:
125
+ ```json
126
+ {
127
+ "ralph_loop": {
128
+ "coverage_command": "npm run test:coverage",
129
+ "coverage_report_path": "coverage/coverage-summary.json"
90
130
  }
91
131
  }
92
132
  ```
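Steps 2 and 3 of the Coverage Mode behavior above (parse `coverage/coverage-summary.json`, compare line coverage to the threshold) can be sketched as follows. The sketch assumes the Istanbul `json-summary` layout, where overall line coverage sits at `total.lines.pct`; `checkLineCoverage` and `CoverageCheck` are hypothetical names, and this is not the code `ralph-loop.js` actually runs.

```typescript
interface CoverageCheck {
  pct: number;
  passed: boolean;
}

// Compare overall line coverage from a json-summary report to a threshold.
function checkLineCoverage(summaryJson: string, threshold: number): CoverageCheck {
  const summary = JSON.parse(summaryJson);
  const raw: unknown = summary?.total?.lines?.pct;
  // Assumption: a missing or non-numeric pct (nothing measured) counts as 0%.
  const pct = typeof raw === "number" ? raw : 0;
  return { pct, passed: pct >= threshold };
}
```

The function takes the report as a string so it can be tested without a coverage run; a caller would read the file named by `coverage_report_path` and pass its contents in.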
@@ -111,7 +151,7 @@ When `VISUAL=true` is specified, the loop adds screenshot verification:
111
151
  - Any work where visual appearance matters
112
152
 
113
153
  **Setup requirement:**
114
- Run `/agileflow:setup:visual-e2e` first to install Playwright and create e2e tests.
154
+ Run `/agileflow:configure` and select "Set up Visual E2E testing" to install Playwright and create e2e tests.
115
155
 
116
156
  ### Loop Control Commands
117
157
 
@@ -200,7 +240,7 @@ Analysis/Review → /agileflow:multi-expert or Task(subagent_t
200
240
  - `agileflow-api` - Endpoints, business logic
201
241
  - `agileflow-ui` - Components, styling
202
242
  - `agileflow-testing` - Tests, coverage
203
- - `agileflow-orchestrator` - Multi-domain coordination
243
+ - `agileflow-orchestrator` - Multi-domain coordination (supports nested loops for quality gates)
204
244
 
205
245
  ---
206
246
 
@@ -353,6 +393,23 @@ Task(
353
393
  )
354
394
  ```
355
395
 
396
+ **🧪 EXPERIMENTAL: Nested Loops with Quality Gates**
397
+
398
+ When you need agents to iterate until quality gates pass (coverage ≥ 80%, tests pass, etc.), the orchestrator can use **nested agent loops**. Each agent runs its own isolated loop.
399
+
400
+ ```
401
+ Task(
402
+ description: "Profile feature with quality gates",
403
+ prompt: "Implement profile with quality enforcement:
404
+ 1. API: /api/profile with COVERAGE >= 80% (agent loop)
405
+ 2. UI: ProfilePage with VISUAL verification (agent loop)
406
+ Use agent-loop.js for isolated quality iterations.",
407
+ subagent_type: "agileflow-orchestrator"
408
+ )
409
+ ```
410
+
411
+ See `orchestrator.md` → "NESTED LOOP MODE" section for full details.
412
+
356
413
  ---
357
414
 
358
415
  #### Pattern 3: Parallel Execution (Manual Coordination)
@@ -441,8 +498,6 @@ Task(
441
498
  | `docs/09-agents/status.json` | Story tracking, WIP status |
442
499
  | `docs/09-agents/session-state.json` | Session state, active command |
443
500
  | `CLAUDE.md` | Project conventions (included in full above) |
444
- | `docs/02-practices/*.md` | Implementation patterns |
445
- | `docs/04-architecture/*.md` | System design docs |
446
501
 
447
502
  ---
448
503
 
@@ -917,15 +972,6 @@ After loading context, analyze and present ranked options:
917
972
  - Session state, current story
918
973
  - Docs structure, research notes
919
974
 
920
- **Read manually for deep dives:**
921
-
922
- | Domain | Docs |
923
- |--------|------|
924
- | Database | `docs/04-architecture/database-*.md` |
925
- | API | `docs/04-architecture/api-*.md` |
926
- | UI | `docs/02-practices/styling.md` |
927
- | Testing | `docs/02-practices/testing.md` |
928
-
929
975
  **State files:**
930
976
  - `docs/09-agents/status.json` - Story tracking
931
977
  - `docs/09-agents/bus/log.jsonl` - Agent messages