agent-bober 0.4.3 → 0.5.1 (npm package version diff, Mend)

Files changed (17)

package/README.md +30 -0
package/agents/bober-evaluator.md +277 -8
package/agents/bober-generator.md +155 -0
package/agents/bober-planner.md +70 -0
package/dist/cli/commands/init.js +1 -0
package/dist/cli/commands/init.js.map +1 -1
package/dist/evaluators/builtin/playwright.d.ts +11 -0
package/dist/evaluators/builtin/playwright.d.ts.map +1 -1
package/dist/evaluators/builtin/playwright.js +259 -12
package/dist/evaluators/builtin/playwright.js.map +1 -1
package/package.json +1 -1
package/skills/bober.eval/SKILL.md +145 -148
package/skills/bober.playwright/SKILL.md +429 -0
package/skills/bober.playwright/references/playwright-patterns.md +377 -0
package/skills/bober.run/SKILL.md +425 -118
package/skills/bober.sprint/SKILL.md +147 -57
package/templates/presets/nextjs/bober.config.json +2 -1

package/skills/bober.playwright/SKILL.md ADDED

@@ -0,0 +1,429 @@
+---
+name: bober.playwright
+description: "Set up Playwright E2E testing and generate test files for your project. Use to initialize Playwright config, write E2E tests for existing UI, or debug failing E2E tests."
+argument-hint: "[what to test or 'setup' to initialize]"
+handoffs:
+  - label: "Evaluate with E2E"
+    command: /bober-eval
+    prompt: "Run evaluation including Playwright E2E tests"
+---
+# bober.playwright — Playwright E2E Testing Skill
+You are running the **bober.playwright** skill. Your job is to set up Playwright end-to-end testing in the project, generate robust E2E test files for UI features, or debug failing E2E tests. This skill operates in three modes depending on the argument provided.
+## Mode Detection
+Determine the mode from the argument:
+- **`setup`**, or no argument when no `playwright.config.ts` exists -> **Setup Mode**
+- **`debug`** -> **Debug Mode**
+- **Any other argument** (e.g., `"test the login flow"`, `"test the dashboard"`) -> **Generate Tests Mode**
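The mode rules above can be sketched as a small pure function. This is only an illustration: `detectMode` and its parameters are hypothetical names, and the rules do not say what happens when no argument is given but `playwright.config.ts` already exists, so the sketch returns `null` for that case rather than guessing.

```typescript
type Mode = "setup" | "debug" | "generate";

// Hypothetical helper mirroring the Mode Detection rules.
// Returns null for the one case the rules leave unspecified:
// no argument while playwright.config.ts already exists.
function detectMode(argument: string | undefined, hasPlaywrightConfig: boolean): Mode | null {
  const arg = (argument ?? "").trim();
  if (arg === "setup") return "setup";
  if (arg === "debug") return "debug";
  if (arg === "") return hasPlaywrightConfig ? null : "setup";
  // Any other argument is treated as a description of what to test.
  return "generate";
}
```

For example, `detectMode("test the login flow", true)` yields `"generate"`, while `detectMode(undefined, false)` yields `"setup"`.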
+---
+## Mode 1: Setup (`/bober-playwright setup`)
+Initialize Playwright E2E testing infrastructure in the project.
+### Step 1: Check Prerequisites
+1. Verify `package.json` exists (this is a Node.js project).
+2. Check if Playwright is already installed:
+   ```bash
+   grep -q '@playwright/test' package.json 2>/dev/null
+   ```
+3. If already installed, inform the user and ask if they want to reconfigure.
+### Step 2: Install Playwright
+```bash
+npm install -D @playwright/test
+npx playwright install chromium
+```
+Install only `chromium` to keep the setup small and fast for CI. Users can add more browsers later.
+### Step 3: Read Project Configuration
+Read `bober.config.json` to determine:
+- `commands.dev`: The dev server start command (e.g., `npm run dev`)
+- The dev server port: Parse it from the dev command or default to `3000`
+Also check for common framework configurations:
+- `vite.config.ts` -> default port is typically `5173`
+- `next.config.js` / `next.config.ts` -> default port is typically `3000`
+- `package.json` scripts -> look for port flags in the dev command
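One way to read the port rules above is sketched below. `detectDevPort` and `PortHints` are hypothetical names, and the set of flag spellings the regular expression accepts (`--port 4000`, `--port=4000`, `-p 4000`, `PORT=4000`) is an assumption rather than something the skill specifies. If `commands.dev` is an npm script such as `npm run dev`, the script body from `package.json` would need to be passed in instead.

```typescript
// Hypothetical input: what the skill has learned about the project so far.
interface PortHints {
  devCommand: string;     // resolved dev command, e.g. "next dev -p 4000"
  hasViteConfig: boolean; // true if vite.config.ts exists
}

function detectDevPort(hints: PortHints): number {
  // An explicit port in the dev command takes priority over framework defaults.
  const match = hints.devCommand.match(/(?:--port[=\s]+|-p\s+|\bPORT=)(\d{2,5})\b/);
  if (match) return Number(match[1]);
  // Vite's usual default port.
  if (hints.hasViteConfig) return 5173;
  // Next.js default, which is also the skill's general fallback.
  return 3000;
}
```

The explicit flag is checked first because a project can override its framework's default port in the dev script.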
+### Step 4: Create `playwright.config.ts`
+Create the Playwright configuration file at the project root:
+```typescript
+import { defineConfig, devices } from '@playwright/test';
+/**
+ * Playwright E2E test configuration.
+ * Generated by bober.playwright — edit freely.
+ *
+ * @see https://playwright.dev/docs/test-configuration
+ */
+export default defineConfig({
+  testDir: './e2e',
+  fullyParallel: false,
+  forbidOnly: !!process.env.CI,
+  retries: 1,
+  workers: 1,
+  reporter: [
+    ['list'],
+    ['json', { outputFile: 'e2e-results/results.json' }],
+  ],
+  use: {
+    baseURL: 'http://localhost:<PORT>',
+    trace: 'on-first-retry',
+    screenshot: 'only-on-failure',
+  },
+  projects: [
+    {
+      name: 'chromium',
+      use: { ...devices['Desktop Chrome'] },
+    },
+  ],
+  webServer: {
+    command: '<DEV_COMMAND>',
+    port: <PORT>,
+    reuseExistingServer: !process.env.CI,
+    timeout: 30_000,
+  },
+});
+```
+Replace `<PORT>` with the detected port and `<DEV_COMMAND>` with the dev command from `bober.config.json` (e.g., `npm run dev`).
+### Step 5: Create the `e2e/` Directory and Example Test
+Create `e2e/example.spec.ts`:
+```typescript
+import { test, expect } from '@playwright/test';
+test.describe('Smoke Test', () => {
+  test('page loads successfully', async ({ page }) => {
+    await page.goto('/');
+    await page.waitForLoadState('networkidle');
+    // Verify the page has a title
+    const title = await page.title();
+    expect(title).toBeTruthy();
+  });
+  test('page has no console errors', async ({ page }) => {
+    const errors: string[] = [];
+    page.on('console', (msg) => {
+      if (msg.type() === 'error') {
+        errors.push(msg.text());
+      }
+    });
+    await page.goto('/');
+    await page.waitForLoadState('networkidle');
+    expect(errors).toEqual([]);
+  });
+});
+```
+### Step 6: Create `e2e-results/` Directory and Update `.gitignore`
+```bash
+mkdir -p e2e-results
+```
+Add to `.gitignore` (check if entries already exist before appending):
+```
+# Playwright
+e2e-results/
+test-results/
+playwright-report/
+blob-report/
+```
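The "check if entries already exist before appending" step could look roughly like this. `mergeGitignore` is a hypothetical helper that works on the file contents as a string; it only recognizes exact line matches, so a variant such as `test-results` without the trailing slash would be treated as missing.

```typescript
const PLAYWRIGHT_IGNORES = ["e2e-results/", "test-results/", "playwright-report/", "blob-report/"];

// Returns the new .gitignore contents, appending only entries that are not
// already present as a full line. Returns the input unchanged if nothing is missing.
function mergeGitignore(existing: string, entries: string[] = PLAYWRIGHT_IGNORES): string {
  const present = new Set(existing.split(/\r?\n/).map((line) => line.trim()));
  const missing = entries.filter((entry) => !present.has(entry));
  if (missing.length === 0) return existing;
  // Make sure the appended section starts on its own line.
  const separator = existing === "" || existing.endsWith("\n") ? "" : "\n";
  return existing + separator + ["# Playwright", ...missing].join("\n") + "\n";
}
```

Running the helper on its own output returns that output unchanged, so repeating the setup step does not duplicate entries.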
+### Step 7: Verify the Setup
+Run Playwright to confirm the installation works:
+```bash
+npx playwright test --reporter=list 2>&1 | head -50
+```
+If the dev server fails to start, report the error and suggest the user verify their `commands.dev` configuration.
+### Step 8: Report
+```
+## Playwright Setup Complete
+- Installed: @playwright/test + chromium browser
+- Config: playwright.config.ts (baseURL: http://localhost:<PORT>)
+- Test directory: e2e/
+- Example test: e2e/example.spec.ts
+- Results output: e2e-results/results.json
+### Run Tests
+npx playwright test                # run all E2E tests
+npx playwright test --ui           # open interactive UI mode
+npx playwright test --reporter=list # run with detailed output
+### Next Steps
+- Write E2E tests: /bober-playwright "test the <feature>"
+- Run full evaluation: /bober-eval
+```
+---
+## Mode 2: Generate Tests (`/bober-playwright "test the login flow"`)
+Generate Playwright E2E test files that verify specific UI features.
+### Step 1: Verify Playwright is Installed
+Check for `playwright.config.ts` at the project root. If it does not exist:
+- Tell the user to run `/bober-playwright setup` first
+- Stop execution
+### Step 2: Understand What to Test
+1. Read the user's argument to understand the target feature
+2. Read the current sprint contract from `.bober/contracts/` (most recent `in-progress` contract):
+   - Extract `successCriteria` that relate to UI behavior
+   - Note the `estimatedFiles` to understand which source files are relevant
+3. If no sprint contract exists, use the user's description directly
+### Step 3: Read Relevant Source Files
+Based on the sprint contract and user description:
+1. Read the UI component files that implement the feature
+2. Read the routing configuration (e.g., `App.tsx`, `app/` directory for Next.js)
+3. Read any API endpoints the UI interacts with
+4. Note existing `data-testid` attributes in the source code
+### Step 4: Generate Test Files
+Create test files in `e2e/` following these rules:
+**File naming:** `e2e/<feature-slug>.spec.ts`
+**Structure:**
+```typescript
+import { test, expect } from '@playwright/test';
+test.describe('Feature: <Feature Name>', () => {
+  test.beforeEach(async ({ page }) => {
+    // Navigate to the feature's page
+    await page.goto('/<route>');
+    await page.waitForLoadState('networkidle');
+  });
+  test('<success criterion description>', async ({ page }) => {
+    // Use data-testid selectors exclusively
+    const element = page.getByTestId('<testid>');
+    await expect(element).toBeVisible();
+    // Perform user actions
+    await page.getByTestId('<input-testid>').fill('test value');
+    await page.getByTestId('<submit-testid>').click();
+    // Assert outcomes
+    await expect(page.getByTestId('<result-testid>')).toBeVisible();
+    await expect(page.getByTestId('<result-testid>')).toHaveText(/expected text/);
+  });
+});
+```
+**Selector rules (non-negotiable):**
+- Use `page.getByTestId('...')` for all element selectors
+- Never use CSS selectors (`page.locator('.class-name')`)
+- Never use tag name selectors (`page.locator('button')`)
+- Use `page.getByRole(...)` or `page.getByText(...)` only as a fallback when `data-testid` is genuinely not appropriate (e.g., checking visible text content)
+**Assertion patterns:**
+- Use `await expect(locator).toBeVisible()` for visibility checks
+- Use `await expect(locator).toHaveText(...)` for text content
+- Use `await expect(locator).toHaveAttribute(...)` for attribute checks
+- Use `await expect(page).toHaveURL(...)` for navigation assertions
+- Use `await expect(locator).toBeDisabled()` / `toBeEnabled()` for interactive state
+- Never use raw `expect(await locator.textContent()).toBe(...)` -- use Playwright's auto-waiting assertions
+**Wait patterns:**
+- Use `page.waitForLoadState('networkidle')` after navigation in SPAs
+- Use `await expect(locator).toBeVisible()` instead of manual waits for element appearance
+- Use `page.waitForResponse(...)` when waiting for specific API calls
+- Never use `page.waitForTimeout()` -- it is flaky and unreliable
+### Step 5: Check for Missing `data-testid` Attributes
+After generating tests, check if the source components have the required `data-testid` attributes.
+For each `data-testid` used in tests, search the source files:
+```bash
+grep -r 'data-testid="<testid>"' src/ app/ components/ 2>/dev/null
+```
+If any are missing, list them clearly:
+```
+## Missing data-testid Attributes
+The following data-testid attributes must be added to your components
+for these tests to work:
+- `data-testid="login-form"` -> src/components/LoginForm.tsx
+- `data-testid="email-input"` -> src/components/LoginForm.tsx
+- `data-testid="submit-button"` -> src/components/LoginForm.tsx
+- `data-testid="error-message"` -> src/components/LoginForm.tsx
+The generator must add these attributes in the next sprint iteration.
+```
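The same check can be sketched without shelling out to `grep`. The helpers below are hypothetical and operate on source text already read into memory. Like the `grep` command, they only find literal `data-testid="..."` attributes, so ids built dynamically or passed through props would be reported as missing.

```typescript
// Collect the ids passed to getByTestId(...) as string literals in a test file.
function extractTestIds(testSource: string): string[] {
  const pattern = /getByTestId\(\s*(['"`])([^'"`]+)\1\s*\)/g;
  const ids: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(testSource)) !== null) {
    if (!ids.includes(match[2])) ids.push(match[2]);
  }
  return ids;
}

// Report ids that have no literal data-testid="..." in any component source.
function findMissingTestIds(testSource: string, componentSources: string[]): string[] {
  return extractTestIds(testSource).filter(
    (id) => !componentSources.some((src) => src.includes(`data-testid="${id}"`)),
  );
}
```

The returned list maps directly onto the "Missing data-testid Attributes" report shown above.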
+### Step 6: Verify Tests Are Syntactically Valid
+Run a quick check:
+```bash
+npx playwright test --list 2>&1
+```
+This lists all discovered tests without executing them. If there are syntax errors, fix them.
+### Step 7: Report
+```
+## E2E Tests Generated
+Created: e2e/<feature-slug>.spec.ts
+Tests: <N> test cases covering <M> success criteria
+### Test Cases
+1. <criterion> -> <test description>
+2. <criterion> -> <test description>
+...
+### Missing data-testid Attributes
+<list if any>
+### Run Tests
+npx playwright test e2e/<feature-slug>.spec.ts --reporter=list
+```
+---
+## Mode 3: Debug (`/bober-playwright debug`)
+Analyze failing Playwright tests and provide actionable diagnostic information.
+### Step 1: Find Test Results
+Look for test results in this order:
+1. `e2e-results/results.json` (JSON reporter output)
+2. `test-results/` directory (Playwright default output with screenshots/traces)
+3. If neither exists, run the tests:
+   ```bash
+   mkdir -p e2e-results && npx playwright test --reporter=json > e2e-results/results.json
+   ```
+### Step 2: Parse Results
+Read `e2e-results/results.json` and extract:
+- Total tests, passed, failed, skipped
+- For each failed test:
+  - Test name and file location
+  - Error message and stack trace
+  - Screenshot path (if available in `test-results/`)
+  - Trace file path (if available)
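A minimal sketch of the extraction step follows. The interfaces describe an assumed subset of the JSON reporter output (nested `suites`, each with `specs` that carry `ok`, `file`, `line`, and per-test `results`); these field names should be checked against a report produced by the installed Playwright version. `collectFailures` and `Failure` are hypothetical names, and the sketch covers only the per-failure details, not the totals.

```typescript
// Assumed subset of the JSON report shape; verify against a real results.json.
interface ReportResult {
  status: string;
  error?: { message?: string };
  attachments?: { name: string; path?: string }[];
}
interface ReportSpec { title: string; ok: boolean; file: string; line: number; tests: { results: ReportResult[] }[] }
interface ReportSuite { title: string; specs?: ReportSpec[]; suites?: ReportSuite[] }
interface Report { suites: ReportSuite[] }

interface Failure { name: string; location: string; error: string; attachments: string[] }

function collectFailures(report: Report): Failure[] {
  const failures: Failure[] = [];
  const visit = (suite: ReportSuite, path: string[]): void => {
    for (const spec of suite.specs ?? []) {
      if (spec.ok) continue;
      let error = "";
      const attachments: string[] = [];
      for (const t of spec.tests) {
        for (const r of t.results) {
          // Keep the message from the last attempt that reported one.
          if (r.error?.message) error = r.error.message;
          for (const a of r.attachments ?? []) if (a.path) attachments.push(a.path);
        }
      }
      failures.push({
        name: [...path, spec.title].join(" > "),
        location: `${spec.file}:${spec.line}`,
        error,
        attachments,
      });
    }
    // describe() blocks appear as nested suites; carry their titles along.
    for (const child of suite.suites ?? []) visit(child, [...path, child.title]);
  };
  for (const suite of report.suites) visit(suite, []);
  return failures;
}
```

Attachment paths collected this way would include any screenshots and traces recorded for failed attempts, which covers the last two items in the list above.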
+### Step 3: Analyze Failures
+For each failed test, determine the likely cause:
+**Category 1: Selector failures** (`locator.click: Error: strict mode violation` or `waiting for locator`)
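+- The `data-testid` attribute is missing from the DOM
+- More than one element matches the locator (strict mode violation), for example a duplicated `data-testid`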
+- The element exists but is not visible
+- The element has not rendered yet (timing issue)
+- Suggestion: Check if the component renders the element with the expected `data-testid`
+**Category 2: Assertion failures** (`expect(received).toHaveText(expected)`)
+- The text content does not match the expected value
+- Suggestion: Check the component's rendered output, verify data flow
+**Category 3: Navigation failures** (`page.goto: net::ERR_CONNECTION_REFUSED`)
+- The dev server is not running or not reachable
+- The port is wrong
+- Suggestion: Check `playwright.config.ts` webServer config and `bober.config.json` commands.dev
+**Category 4: Timeout failures** (`Test timeout of 30000ms exceeded`)
+- The page or element takes too long to load
+- An API call is hanging
+- An infinite loop or unresolved promise in the application
+- Suggestion: Check network requests, look for hanging promises
+**Category 5: JavaScript errors** (console errors captured in test)
+- Runtime errors in the application code
+- Missing environment variables
+- Suggestion: Check the browser console output, look for unhandled errors
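The five categories above can be approximated with a first-pass heuristic over the error message. `categorizeFailure` is a hypothetical helper, the patterns are assumptions based on the example messages quoted in each category, and the check order is a judgment call: selector and assertion messages often mention a timeout as well, so the generic timeout check comes last.

```typescript
type FailureCategory = "selector" | "assertion" | "navigation" | "timeout" | "runtime";

// Heuristic only: real messages vary between Playwright versions.
function categorizeFailure(message: string): FailureCategory {
  if (/net::ERR_/.test(message)) return "navigation";
  if (/strict mode violation|waiting for locator/i.test(message)) return "selector";
  // Matches headers such as expect(received).toHaveText(expected).
  if (/expect\(.*\)\.\w+\(/.test(message)) return "assertion";
  if (/timeout/i.test(message)) return "timeout";
  // Anything else is treated as an application runtime error.
  return "runtime";
}
```

The result is a starting point for the analysis rather than a verdict. A message that matches no pattern falls through to `runtime` and still needs a manual look.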
+### Step 4: Check Screenshots
+If screenshots exist in `test-results/`:
+```bash
+find test-results/ -name '*.png' 2>/dev/null
+```
+List them with their associated test names so the user (or generator) can inspect them.
+### Step 5: Report
+```
+## Playwright Debug Report
+### Summary
+- Total: <N> tests
+- Passed: <N>
+- Failed: <N>
+- Skipped: <N>
+### Failures
+#### 1. <Test Name>
+- File: e2e/<file>.spec.ts:<line>
+- Error: <error message>
+- Category: <selector/assertion/navigation/timeout/runtime>
+- Likely Cause: <analysis>
+- Investigate: <which source file to look at>
+- Screenshot: <path if available>
+#### 2. <Test Name>
+...
+### Suggested Actions
+- <action 1>
+- <action 2>
+...
+Note: This skill diagnoses failures but does not fix them.
+Use /bober-sprint to have the generator fix the identified issues.
+```
+---
+## General Rules
+1. **Never generate tests that use CSS class selectors.** The `data-testid` requirement is non-negotiable when Playwright is the evaluation strategy. This produces stable, refactor-resistant tests.
+2. **Never use `page.waitForTimeout()`.** It introduces flakiness. Use Playwright's auto-waiting assertions or `waitForLoadState` / `waitForResponse` instead.
+3. **One test file per feature.** Do not create a single mega-test file. Map files to sprint features or logical UI areas.
+4. **Tests must be independent.** Each test should be able to run in isolation. Use `test.beforeEach` for common setup, not test ordering.
+5. **Handle authentication state.** If tests require a logged-in user, use Playwright's `storageState` to save and reuse auth cookies. Create an `e2e/auth.setup.ts` file for the auth flow and reference it as a setup project in `playwright.config.ts`.
+6. **Keep tests focused on user behavior.** Test what users see and do, not internal implementation details. "User can submit the form and see a success message" is a good test. "The Redux store dispatches ACTION_X" is a bad test.