agileflow 2.80.0 → 2.81.0

Files changed (18)

package/README.md +6 -6
package/package.json +1 -1
package/scripts/agent-loop.js +765 -0
package/scripts/agileflow-configure.js +3 -1
package/scripts/agileflow-welcome.js +65 -0
package/scripts/damage-control-bash.js +22 -115
package/scripts/damage-control-edit.js +19 -156
package/scripts/damage-control-write.js +19 -156
package/scripts/lib/damage-control-utils.js +251 -0
package/scripts/obtain-context.js +57 -2
package/scripts/ralph-loop.js +230 -26
package/scripts/session-manager.js +434 -20
package/src/core/agents/configuration-visual-e2e.md +300 -0
package/src/core/agents/orchestrator.md +166 -0
package/src/core/commands/babysit.md +61 -15
package/src/core/commands/configure.md +372 -100
package/src/core/commands/session/end.md +332 -103
package/src/core/commands/setup/visual-e2e.md +0 -462

package/src/core/agents/configuration-visual-e2e.md ADDED

@@ -0,0 +1,300 @@
+---
+name: configuration-visual-e2e
+description: Configure Visual E2E testing infrastructure with Playwright and screenshot verification
+tools: Read, Write, Edit, Bash, Glob, Grep
+model: haiku
+compact_context:
+  priority: high
+  preserve_rules:
+    - "Install Playwright with npx playwright install --with-deps chromium"
+    - "Create playwright.config.ts with webServer config for auto-starting dev server"
+    - "Create tests/e2e/ directory with example test that takes screenshots"
+    - "Create screenshots/ directory for visual verification workflow"
+    - "Add test:e2e script to package.json"
+    - "All screenshots must be visually reviewed and renamed with 'verified-' prefix"
+    - "Use TodoWrite to track all 10 setup steps"
+    - "Run example test after setup to verify it works"
+  state_fields:
+    - playwright_installed
+    - config_created
+    - example_test_created
+    - screenshots_dir_created
+---
+# Configuration: Visual E2E Testing
+Set up Visual E2E testing infrastructure with Playwright and screenshot verification workflow for reliable UI development.
+---
+## What This Does
+Visual E2E testing catches issues that functional tests miss. This setup has four parts:
+1. **Playwright Setup** - Install test runner and chromium browser
+2. **Screenshot Capture** - E2E tests capture screenshots during test runs
+3. **Visual Verification** - Claude reviews screenshots before marking UI work complete
+4. **Auto-Start Dev Server** - webServer config auto-starts dev server for tests
+---
+## Configuration Steps
+### Step 1: Check Prerequisites
+```bash
+# Verify package.json exists
+ls package.json
+```
+If no package.json, exit with: "This project needs a package.json. Run `npm init` first."
+### Step 2: Ask User to Proceed
+```xml
+<invoke name="AskUserQuestion">
+<parameter name="questions">[{
+  "question": "Set up Visual E2E testing with Playwright?",
+  "header": "Visual E2E",
+  "multiSelect": false,
+  "options": [
+    {"label": "Yes, install Playwright (Recommended)", "description": "~300MB for chromium browser, creates tests/e2e/ and screenshots/"},
+    {"label": "Skip", "description": "No Visual E2E setup"}
+  ]
+}]</parameter>
+</invoke>
+```
+If user selects "Skip", exit with: "Visual E2E setup skipped. Run /agileflow:configure to set up later."
+### Step 3: Ask Dev Server Configuration
+```xml
+<invoke name="AskUserQuestion">
+<parameter name="questions">[{
+  "question": "What command starts your dev server?",
+  "header": "Dev Server",
+  "multiSelect": false,
+  "options": [
+    {"label": "npm run dev", "description": "Default Next.js/Vite command"},
+    {"label": "npm start", "description": "Create React App default"},
+    {"label": "yarn dev", "description": "Yarn package manager"}
+  ]
+}]</parameter>
+</invoke>
+```
+### Step 4: Install Playwright
+```bash
+# Install Playwright test runner
+npm install --save-dev @playwright/test
+# Install chromium browser (smallest option, ~300MB)
+npx playwright install --with-deps chromium
+```
+### Step 5: Create playwright.config.ts
+Create `playwright.config.ts` in project root:
+```typescript
+import { defineConfig, devices } from '@playwright/test';
+export default defineConfig({
+  testDir: './tests/e2e',
+  // Run tests in parallel
+  fullyParallel: true,
+  // Fail the build on CI if you accidentally left test.only
+  forbidOnly: !!process.env.CI,
+  // Retry on CI only
+  retries: process.env.CI ? 2 : 0,
+  // Opt out of parallel tests on CI
+  workers: process.env.CI ? 1 : undefined,
+  // Reporter
+  reporter: 'html',
+  use: {
+    // Base URL for navigation
+    baseURL: 'http://localhost:3000',
+    // Capture screenshot on every test
+    screenshot: 'on',
+    // Collect trace when retrying a failed test
+    trace: 'on-first-retry',
+  },
+  // Configure webServer to auto-start dev server
+  webServer: {
+    command: 'npm run dev', // Replace with user's choice from Step 3
+    url: 'http://localhost:3000',
+    reuseExistingServer: !process.env.CI,
+    timeout: 120000,
+  },
+  projects: [
+    {
+      name: 'chromium',
+      use: { ...devices['Desktop Chrome'] },
+    },
+  ],
+});
+```
+### Step 6: Create Directory Structure
+```bash
+# Create tests/e2e directory
+mkdir -p tests/e2e
+# Create screenshots directory
+mkdir -p screenshots
+```
+### Step 7: Create Example Test
+Create `tests/e2e/visual-example.spec.ts`:
+```typescript
+import { test, expect } from '@playwright/test';
+test.describe('Visual Verification Examples', () => {
+  test('homepage loads correctly', async ({ page }) => {
+    await page.goto('/');
+    // Capture full-page screenshot for visual verification
+    await page.screenshot({
+      path: 'screenshots/homepage-full.png',
+      fullPage: true,
+    });
+    // Basic assertions
+    await expect(page).toHaveTitle(/./);
+  });
+  test('component renders correctly', async ({ page }) => {
+    await page.goto('/');
+    // Wait for the header first: toBeVisible() auto-waits, isVisible() does not
+    const header = page.locator('header').first();
+    await expect(header).toBeVisible();
+    // Capture specific element screenshot
+    await header.screenshot({
+      path: 'screenshots/header-component.png',
+    });
+  });
+});
+```
+### Step 8: Add npm Scripts
+Add to package.json scripts:
+```json
+{
+  "scripts": {
+    "test:e2e": "playwright test",
+    "test:e2e:ui": "playwright test --ui",
+    "test:e2e:headed": "playwright test --headed"
+  }
+}
+```
+### Step 9: Run Verification Test
+```bash
+npm run test:e2e
+```
+### Step 10: Show Completion Summary
+```
+Visual E2E Setup Complete
+Installed:
+- @playwright/test
+- chromium browser
+Created:
+- playwright.config.ts (with webServer auto-start)
+- tests/e2e/visual-example.spec.ts (example test)
+- screenshots/ (for visual verification)
+Added scripts to package.json:
+- npm run test:e2e         Run all e2e tests
+- npm run test:e2e:ui      Run with Playwright UI
+- npm run test:e2e:headed  Run with visible browser
+Visual Verification Workflow:
+1. Run tests: npm run test:e2e
+2. Review screenshots in screenshots/
+3. Rename verified: mv file.png verified-file.png
+4. Verify all: node scripts/screenshot-verifier.js
+Why Visual Mode?
+Tests passing doesn't mean UI looks correct. A button can "work"
+but be the wrong color, position, or missing entirely.
+Visual verification catches these issues.
+```
+---
+## Visual Verification Workflow
+After running tests:
+1. **Review screenshots**: Read each screenshot in screenshots/
+2. **Verify visually**: Check that UI looks correct
+3. **Rename verified**: `mv screenshots/homepage-full.png screenshots/verified-homepage-full.png`
+4. **Run verifier**: `node scripts/screenshot-verifier.js --path ./screenshots`
+This ensures Claude actually looked at each screenshot before declaring completion.
+---
+## Integration with Ralph Loop
+When using Visual Mode in Ralph Loop:
+```bash
+# Initialize loop with Visual Mode
+node scripts/ralph-loop.js --init --epic=EP-XXXX --visual
+# Loop checks:
+# 1. npm test passes
+# 2. All screenshots have verified- prefix
+# 3. Minimum 2 iterations completed
+```
+Visual Mode prevents premature completion promises for UI work.
+---
+## Troubleshooting
+**Tests fail with "No server running":**
+- Ensure webServer command matches your dev server command
+- Check the port number in baseURL matches your app
+**Screenshots directory empty:**
+- Tests must include `await page.screenshot({path: 'screenshots/...'})` calls
+- Check test output for errors
+**Browser not installed:**
+- Run `npx playwright install --with-deps chromium`
+---
+## Related
+- Playwright docs: https://playwright.dev/docs/intro
+- webServer config: https://playwright.dev/docs/test-webserver
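
The visual verification workflow in configuration-visual-e2e.md relies on a naming convention: a screenshot counts as reviewed only once it carries the `verified-` prefix. The sketch below illustrates that check in JavaScript. It is not the package's `scripts/screenshot-verifier.js`, whose contents are not shown in this diff; the function name `checkScreenshots`, the returned fields, and the rule that an empty directory does not pass are all assumptions made for illustration.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper (not from the package): split a directory's PNG files
// into verified and unverified, based only on the 'verified-' filename prefix.
function checkScreenshots(dir) {
  const pngs = fs.readdirSync(dir).filter((f) => f.toLowerCase().endsWith('.png'));
  const unverified = pngs.filter((f) => !f.startsWith('verified-'));
  return {
    total: pngs.length,
    verified: pngs.length - unverified.length,
    unverified,
    // Assumption: an empty directory does not count as passing.
    passed: pngs.length > 0 && unverified.length === 0,
  };
}

// Demo in a temporary directory so the sketch runs without a real project.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'screenshots-'));
fs.writeFileSync(path.join(demoDir, 'verified-homepage-full.png'), '');
fs.writeFileSync(path.join(demoDir, 'header-component.png'), '');

const before = checkScreenshots(demoDir);
console.log('before rename:', before);

// Equivalent of: mv header-component.png verified-header-component.png
fs.renameSync(
  path.join(demoDir, 'header-component.png'),
  path.join(demoDir, 'verified-header-component.png'),
);

const after = checkScreenshots(demoDir);
console.log('after rename:', after);
```

A check of this shape only proves that files were renamed, so it complements the manual review step rather than replacing it.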

package/src/core/agents/orchestrator.md CHANGED

@@ -90,6 +90,7 @@ RULE #4: SYNTHESIS REQUIREMENTS
 3. Collect ALL results before synthesizing
 4. Always flag conflicts in final answer
 5. Provide recommendation with rationale
+6. 🧪 EXPERIMENTAL: For quality gates (coverage ≥ X%, tests pass), use nested loops - see "NESTED LOOP MODE" section
 <!-- COMPACT_SUMMARY_END -->
@@ -326,3 +327,168 @@ These are independent — deploying in parallel.
 Proceed with integration?
 ```
+---
+## NESTED LOOP MODE (Experimental)
+When agents need to iterate until quality gates pass, use **nested loops**. Each agent runs its own isolated loop with quality verification.
+### When to Use
+| Scenario | Use Nested Loops? |
+|----------|-------------------|
+| Simple implementation | No - single expert spawn |
+| Need coverage threshold | Yes - agent loops until coverage met |
+| Need visual verification | Yes - agent loops until screenshots verified |
+| Complex multi-gate feature | Yes - each domain gets its own loop |
+### How It Works
+```
+┌─────────────────────────────────────────────────────────────┐
+│                    ORCHESTRATOR                              │
+│                                                              │
+│  ┌──────────────────┐  ┌──────────────────┐                 │
+│  │ API Agent        │  │ UI Agent         │  (parallel)     │
+│  │ Loop: coverage   │  │ Loop: visual     │                 │
+│  │ Max: 5 iter      │  │ Max: 5 iter      │  ← ISOLATED     │
+│  └──────────────────┘  └──────────────────┘                 │
+│           ↓                    ↓                             │
+│      TaskOutput           TaskOutput                        │
+│           ↓                    ↓                             │
+│  ┌──────────────────────────────────────────────────────┐   │
+│  │              SYNTHESIS + VERIFICATION                 │   │
+│  └──────────────────────────────────────────────────────┘   │
+└─────────────────────────────────────────────────────────────┘
+```
+### Spawning with Agent Loops
+**Step 1: Generate loop ID and include in prompt**
+```
+Task(
+  description: "API with coverage loop",
+  prompt: `Implement /api/profile endpoint.
+  ## AGENT LOOP ACTIVE
+  You have a quality gate to satisfy:
+  - Gate: coverage >= 80%
+  - Max iterations: 5
+  - Loop ID: abc12345
+  ## Workflow
+  1. Implement the feature
+  2. Run the gate check:
+     node .agileflow/scripts/agent-loop.js --check --loop-id=abc12345
+  3. If check returns exit code 2 (running), iterate and improve
+  4. If check returns exit code 0 (passed), you're done
+  5. If check returns exit code 1 (failed), report the failure
+  Continue iterating until the gate passes or max iterations reached.`,
+  subagent_type: "agileflow-api",
+  run_in_background: true
+)
+```
+**Step 2: Initialize the loop before spawning**
+Before spawning the agent, the orchestrator should document that loops are being used. The agent will initialize its own loop using:
+```bash
+node .agileflow/scripts/agent-loop.js --init --gate=coverage --threshold=80 --max=5 --agent=agileflow-api --loop-id=abc12345
+```
+### Available Quality Gates
+| Gate | Flag | Description |
+|------|------|-------------|
+| `tests` | `--gate=tests` | Run test command, pass on exit 0 |
+| `coverage` | `--gate=coverage --threshold=80` | Run coverage, pass when >= threshold |
+| `visual` | `--gate=visual` | Check screenshots have verified- prefix |
+| `lint` | `--gate=lint` | Run lint command, pass on exit 0 |
+| `types` | `--gate=types` | Run tsc --noEmit, pass on exit 0 |
+### Monitoring Progress
+Read the event bus for loop status:
+```bash
+# Events emitted to: docs/09-agents/bus/log.jsonl
+{"type":"agent_loop","event":"init","loop_id":"abc12345","agent":"agileflow-api","gate":"coverage","threshold":80}
+{"type":"agent_loop","event":"iteration","loop_id":"abc12345","iter":1,"value":65,"passed":false}
+{"type":"agent_loop","event":"iteration","loop_id":"abc12345","iter":2,"value":72,"passed":false}
+{"type":"agent_loop","event":"passed","loop_id":"abc12345","final_value":82,"iterations":3}
+```
+### Safety Limits
+| Limit | Value | Enforced By |
+|-------|-------|-------------|
+| Max iterations per agent | 5 | agent-loop.js |
+| Max concurrent loops | 3 | agent-loop.js |
+| Timeout per loop | 10 min | agent-loop.js |
+| Regression abort | 2 consecutive | agent-loop.js |
+| Stall abort | 5 min no progress | agent-loop.js |
+### Example: Full Feature with Quality Gates
+```
+Request: "Implement user profile with API at 80% coverage and UI with visual verification"
+Parallel spawn:
+- agileflow-api with coverage loop (threshold: 80%)
+- agileflow-ui with visual loop
+## Agent Loop Status
+### API Expert (agileflow-api)
+- Gate: coverage >= 80%
+- Iterations: 3
+- Progress: 65% → 72% → 82% ✓
+- Status: PASSED
+### UI Expert (agileflow-ui)
+- Gate: visual (screenshots verified)
+- Iterations: 2
+- Progress: 0/3 → 3/3 verified ✓
+- Status: PASSED
+## Synthesis
+Both quality gates satisfied. Feature implementation complete.
+Files created:
+- src/routes/profile.ts (API)
+- src/components/ProfilePage.tsx (UI)
+- tests/profile.test.ts (coverage)
+- screenshots/verified-profile-*.png (visual)
+```
+### Abort Handling
+If an agent loop fails:
+1. **Max iterations reached**: Report which gate wasn't satisfied
+2. **Regression detected**: Note that the gate value regressed on 2 consecutive iterations
+3. **Stalled**: Note no progress for 5+ minutes
+4. **Timeout**: Note 10-minute limit exceeded
+```markdown
+## Agent Loop FAILED
+### API Expert (agileflow-api)
+- Gate: coverage >= 80%
+- Final: 72%
+- Status: FAILED (max_iterations)
+- Reason: Couldn't reach 80% coverage in 5 iterations
+### Recommendation
+- Review uncovered code paths
+- Consider if 80% is achievable
+- May need to reduce threshold or add more test cases
+```
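
The "Monitoring Progress" section of orchestrator.md shows loop events as JSON lines in `docs/09-agents/bus/log.jsonl`. As a rough illustration of how an orchestrator might turn those events into the per-agent progress lines used in the status examples, here is a minimal JavaScript sketch. The field names are taken only from the four sample events in the diff; the function name `summarizeLoops` is hypothetical, and any event types other than `init`, `iteration`, and `passed` are simply ignored because the diff does not show them.

```javascript
// Sample events copied from the "Monitoring Progress" example in the diff.
const sampleLog = [
  '{"type":"agent_loop","event":"init","loop_id":"abc12345","agent":"agileflow-api","gate":"coverage","threshold":80}',
  '{"type":"agent_loop","event":"iteration","loop_id":"abc12345","iter":1,"value":65,"passed":false}',
  '{"type":"agent_loop","event":"iteration","loop_id":"abc12345","iter":2,"value":72,"passed":false}',
  '{"type":"agent_loop","event":"passed","loop_id":"abc12345","final_value":82,"iterations":3}',
].join('\n');

// Hypothetical helper: group agent_loop events by loop_id and collect gate values.
function summarizeLoops(jsonl) {
  const loops = new Map();
  for (const line of jsonl.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    let evt;
    try {
      evt = JSON.parse(trimmed);
    } catch {
      continue; // skip lines that are not valid JSON
    }
    if (evt.type !== 'agent_loop' || !evt.loop_id) continue;

    const loop = loops.get(evt.loop_id) ?? {
      agent: null, gate: null, threshold: null, values: [], status: 'running',
    };
    if (evt.event === 'init') {
      loop.agent = evt.agent;
      loop.gate = evt.gate;
      loop.threshold = evt.threshold ?? null;
    } else if (evt.event === 'iteration') {
      loop.values.push(evt.value);
    } else if (evt.event === 'passed') {
      // In the sample, the final value has no separate iteration event,
      // so add it only if it has not been recorded yet.
      if (loop.values.length < evt.iterations) loop.values.push(evt.final_value);
      loop.status = 'passed';
    }
    loops.set(evt.loop_id, loop);
  }
  return loops;
}

const summary = summarizeLoops(sampleLog);
for (const [id, loop] of summary) {
  console.log(`${id} (${loop.agent}, ${loop.gate}): ${loop.values.join(' → ')} [${loop.status}]`);
}
```

In a real project the input would come from reading the log file instead of the inline sample, and other `type` values on the bus would pass through the same filter untouched.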

package/src/core/commands/babysit.md CHANGED

@@ -1,6 +1,6 @@
 ---
 description: Interactive mentor for end-to-end feature implementation
-argument-hint: "[EPIC=<id>] [MODE=loop] [MAX=<iterations>] [VISUAL=true]"
+argument-hint: "[EPIC=<id>] [MODE=loop] [MAX=<iterations>] [VISUAL=true] [COVERAGE=<percent>]"
 compact_context:
   priority: critical
   preserve_rules:
@@ -62,6 +62,7 @@ When invoked with `MODE=loop`, babysit runs autonomously through an epic's stori
 | `MODE` | Yes | Must be `loop` for autonomous mode |
 | `MAX` | No | Max iterations (default: 20) |
 | `VISUAL` | No | Enable Visual Mode for UI development (screenshot verification) |
+| `COVERAGE` | No | Enable Coverage Mode - iterate until N% test coverage reached |
 ### To Start Loop Mode
@@ -73,6 +74,9 @@ node scripts/ralph-loop.js --init --epic=EP-0042 --max=20
 # With Visual Mode for UI development
 node scripts/ralph-loop.js --init --epic=EP-0042 --max=20 --visual
+# With Coverage Mode - iterate until 80% coverage
+node scripts/ralph-loop.js --init --epic=EP-0042 --max=20 --coverage=80
 ```
 Or manually write to session-state.json:
@@ -86,7 +90,43 @@ Or manually write to session-state.json:
     "iteration": 0,
     "max_iterations": 20,
     "visual_mode": false,
-    "screenshots_verified": false
+    "screenshots_verified": false,
+    "coverage_mode": false,
+    "coverage_threshold": 80,
+    "coverage_baseline": 0,
+    "coverage_current": 0,
+    "coverage_verified": false
+  }
+}
+```
+### Coverage Mode
+When `COVERAGE=<percent>` is specified, the loop adds test coverage verification:
+```
+/agileflow:babysit EPIC=EP-0042 MODE=loop COVERAGE=80
+```
+**Coverage Mode behavior:**
+1. After tests pass, runs coverage check command
+2. Parses `coverage/coverage-summary.json` (Jest/NYC format)
+3. Compares line coverage to threshold
+4. Requires minimum 2 iterations before completion
+5. Story completes only when coverage ≥ threshold AND confirmed
+**When to use Coverage Mode:**
+- Test-driven epics where coverage matters
+- "Write tests until X% coverage" goals
+- Batch test generation overnight
+**Configuration** (optional):
+Add to `docs/00-meta/agileflow-metadata.json`:
+```json
+{
+  "ralph_loop": {
+    "coverage_command": "npm run test:coverage",
+    "coverage_report_path": "coverage/coverage-summary.json"
   }
 }
 ```
@@ -111,7 +151,7 @@ When `VISUAL=true` is specified, the loop adds screenshot verification:
 - Any work where visual appearance matters
 **Setup requirement:**
-Run `/agileflow:setup:visual-e2e` first to install Playwright and create e2e tests.
+Run `/agileflow:configure` and select "Set up Visual E2E testing" to install Playwright and create e2e tests.
 ### Loop Control Commands
@@ -200,7 +240,7 @@ Analysis/Review                   → /agileflow:multi-expert or Task(subagent_t
 - `agileflow-api` - Endpoints, business logic
 - `agileflow-ui` - Components, styling
 - `agileflow-testing` - Tests, coverage
-- `agileflow-orchestrator` - Multi-domain coordination
+- `agileflow-orchestrator` - Multi-domain coordination (supports nested loops for quality gates)
 ---
@@ -353,6 +393,23 @@ Task(
 )
 ```
+**🧪 EXPERIMENTAL: Nested Loops with Quality Gates**
+When you need agents to iterate until quality gates pass (coverage ≥ 80%, tests pass, etc.), the orchestrator can use **nested agent loops**. Each agent runs its own isolated loop.
+```
+Task(
+  description: "Profile feature with quality gates",
+  prompt: "Implement profile with quality enforcement:
+    1. API: /api/profile with COVERAGE >= 80% (agent loop)
+    2. UI: ProfilePage with VISUAL verification (agent loop)
+    Use agent-loop.js for isolated quality iterations.",
+  subagent_type: "agileflow-orchestrator"
+)
+```
+See `orchestrator.md` → "NESTED LOOP MODE" section for full details.
 ---
 #### Pattern 3: Parallel Execution (Manual Coordination)
@@ -441,8 +498,6 @@ Task(
 | `docs/09-agents/status.json` | Story tracking, WIP status |
 | `docs/09-agents/session-state.json` | Session state, active command |
 | `CLAUDE.md` | Project conventions (included in full above) |
-| `docs/02-practices/*.md` | Implementation patterns |
-| `docs/04-architecture/*.md` | System design docs |
 ---
@@ -917,15 +972,6 @@ After loading context, analyze and present ranked options:
 - Session state, current story
 - Docs structure, research notes
-**Read manually for deep dives:**
-| Domain | Docs |
-|--------|------|
-| Database | `docs/04-architecture/database-*.md` |
-| API | `docs/04-architecture/api-*.md` |
-| UI | `docs/02-practices/styling.md` |
-| Testing | `docs/02-practices/testing.md` |
 **State files:**
 - `docs/09-agents/status.json` - Story tracking
 - `docs/09-agents/bus/log.jsonl` - Agent messages
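
The Coverage Mode section added to babysit.md describes parsing `coverage/coverage-summary.json` and comparing line coverage to the threshold. The JavaScript sketch below shows one way that comparison could look. It is not the code in `ralph-loop.js`, which this excerpt does not display; the helper names are hypothetical, and the `total.lines.pct` layout is assumed from the Istanbul `json-summary` report that Jest and NYC produce.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: compare total line coverage in a parsed summary to a threshold.
function checkCoverage(summary, threshold) {
  const pct = summary?.total?.lines?.pct;
  // pct may be missing or non-numeric (for example when nothing was measured);
  // treat that as a failed gate rather than guessing a value.
  if (typeof pct !== 'number') {
    return { pct: null, passed: false };
  }
  return { pct, passed: pct >= threshold };
}

// Hypothetical helper: read the report from disk; a missing or malformed file fails the gate.
function checkCoverageFile(reportPath, threshold) {
  try {
    return checkCoverage(JSON.parse(fs.readFileSync(reportPath, 'utf8')), threshold);
  } catch {
    return { pct: null, passed: false };
  }
}

// Demo: write an illustrative report to a temporary directory and check it.
const sampleSummary = {
  total: {
    lines: { total: 200, covered: 164, skipped: 0, pct: 82 },
    statements: { total: 220, covered: 176, skipped: 0, pct: 80 },
    functions: { total: 40, covered: 30, skipped: 0, pct: 75 },
    branches: { total: 60, covered: 42, skipped: 0, pct: 70 },
  },
};
const reportDir = fs.mkdtempSync(path.join(os.tmpdir(), 'coverage-'));
const reportPath = path.join(reportDir, 'coverage-summary.json');
fs.writeFileSync(reportPath, JSON.stringify(sampleSummary));

const at80 = checkCoverageFile(reportPath, 80);
const at90 = checkCoverageFile(reportPath, 90);
const missing = checkCoverageFile(path.join(reportDir, 'does-not-exist.json'), 80);
console.log({ at80, at90, missing });
```

Failing closed on a missing report is a design choice in this sketch: a loop that treated an absent file as a pass could declare a story complete without coverage ever having run.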