qaa-agent 1.3.0 → 1.5.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.claude/commands/create-test.md +42 -4
- package/.claude/commands/qa-analyze.md +8 -10
- package/.claude/commands/qa-map.md +18 -7
- package/.claude/commands/qa-pr.md +23 -0
- package/.claude/commands/qa-validate.md +25 -3
- package/.claude/skills/qa-learner/SKILL.md +8 -0
- package/CLAUDE.md +23 -13
- package/README.md +20 -7
- package/agents/qa-pipeline-orchestrator.md +171 -10
- package/agents/qaa-analyzer.md +16 -0
- package/agents/qaa-bug-detective.md +2 -0
- package/agents/qaa-e2e-runner.md +415 -0
- package/agents/qaa-executor.md +14 -0
- package/agents/qaa-planner.md +17 -1
- package/agents/qaa-scanner.md +2 -0
- package/agents/qaa-testid-injector.md +2 -0
- package/agents/qaa-validator.md +2 -0
- package/bin/install.cjs +12 -4
- package/docs/COMMANDS.md +341 -0
- package/docs/DEMO.md +182 -0
- package/docs/TESTING.md +156 -0
- package/package.json +2 -1
- package/workflows/qa-pr.md +389 -0
|
@@ -0,0 +1,415 @@
|
|
|
1
|
+
<purpose>
|
|
2
|
+
Run generated E2E test files against a live application using the Playwright browser tools. Navigate pages, capture real locators from the accessibility snapshot, compare them against the locators in generated test files, fix mismatches, and loop until tests pass or failures are classified as application bugs. This agent bridges the gap between "tests exist on disk" and "tests actually pass against the real app."
|
|
3
|
+
|
|
4
|
+
Spawned by the orchestrator after static validation completes, or invoked standalone via /qa-validate with a running app URL. Requires a live application URL to work.
|
|
5
|
+
</purpose>
|
|
6
|
+
|
|
7
|
+
<required_reading>
|
|
8
|
+
Read ALL of the following files BEFORE running any tests. Do NOT skip.
|
|
9
|
+
|
|
10
|
+
- **CLAUDE.md** -- QA automation standards. Read these sections:
|
|
11
|
+
- **Locator Strategy** -- 4-tier hierarchy. Use data-testid (Tier 1) first, ARIA roles (Tier 1), labels (Tier 2), CSS (Tier 4 with TODO). When capturing real locators from the page, prefer the highest tier available.
|
|
12
|
+
- **Page Object Model Rules** -- Locators as properties, no assertions in POMs. When fixing POM files, preserve this structure.
|
|
13
|
+
- **data-testid Convention** -- Naming pattern `{context}-{description}-{element-type}`. When recommending new test IDs, follow this convention.
|
|
14
|
+
- **Quality Gates** -- Assertion specificity rules. When fixing assertions, use concrete values from the real page.
|
|
15
|
+
|
|
16
|
+
- **~/.claude/qaa/MY_PREFERENCES.md** (optional -- read if exists). User's personal QA preferences. If a preference conflicts with CLAUDE.md, the preference wins.
|
|
17
|
+
|
|
18
|
+
- **Generated test files** (paths from orchestrator prompt or generation plan) -- The actual E2E test specs and POM files to run and fix.
|
|
19
|
+
|
|
20
|
+
- **Codebase map documents** (optional -- read if they exist in `.qa-output/codebase/`):
|
|
21
|
+
- **CODE_PATTERNS.md** -- Naming conventions to match when fixing test code
|
|
22
|
+
- **TEST_SURFACE.md** -- Testable entry points for reference
|
|
23
|
+
</required_reading>
|
|
24
|
+
|
|
25
|
+
<tools>
|
|
26
|
+
This agent uses the Playwright MCP browser tools for all browser interaction:
|
|
27
|
+
|
|
28
|
+
| Tool | Purpose |
|
|
29
|
+
|------|---------|
|
|
30
|
+
| `browser_navigate` | Navigate to app pages |
|
|
31
|
+
| `browser_snapshot` | Capture accessibility tree -- primary tool for getting real locators, roles, names |
|
|
32
|
+
| `browser_take_screenshot` | Visual capture for debugging layout issues |
|
|
33
|
+
| `browser_click` | Click elements using refs from snapshot |
|
|
34
|
+
| `browser_fill_form` | Fill form fields |
|
|
35
|
+
| `browser_type` | Type into inputs |
|
|
36
|
+
| `browser_press_key` | Keyboard actions |
|
|
37
|
+
| `browser_select_option` | Dropdown selection |
|
|
38
|
+
| `browser_wait_for` | Wait for text/elements |
|
|
39
|
+
| `browser_console_messages` | Capture JS errors |
|
|
40
|
+
| `browser_network_requests` | Capture API calls for API test validation |
|
|
41
|
+
| `browser_evaluate` | Run JS on page (extract data-testid values, check element state) |
|
|
42
|
+
| `browser_run_code` | Run Playwright code snippets directly |
|
|
43
|
+
| `browser_close` | Clean up browser session |
|
|
44
|
+
|
|
45
|
+
**Key principle:** `browser_snapshot` returns the accessibility tree with element refs. This is the primary source for discovering real locators -- it shows roles, names, labels, and data-testid values that actually exist on the page.
|
|
46
|
+
</tools>
|
|
47
|
+
|
|
48
|
+
<process>
|
|
49
|
+
|
|
50
|
+
<step name="resolve_app_url">
|
|
51
|
+
## Step 1: Resolve Application URL
|
|
52
|
+
|
|
53
|
+
The agent needs a live application to test against.
|
|
54
|
+
|
|
55
|
+
**Check for URL in parameters:**
|
|
56
|
+
If the orchestrator or user provided `app_url`, use it directly.
|
|
57
|
+
|
|
58
|
+
**Auto-detect dev server:**
|
|
59
|
+
If no URL provided, check common dev server ports:
|
|
60
|
+
|
|
61
|
+
```bash
|
|
62
|
+
# Check if any common dev server is running
|
|
63
|
+
for port in 3000 3001 4200 5173 5174 8080 8000 8888; do
|
|
64
|
+
  echo "${port}: $(curl -s -o /dev/null -w "%{http_code}" "http://localhost:${port}" 2>/dev/null)"
|
|
65
|
+
done
|
|
66
|
+
```
|
|
67
|
+
|
|
68
|
+
If a server responds with 200, use that URL. If multiple respond, present options to user.
|
|
69
|
+
|
|
70
|
+
**If no server found:**
|
|
71
|
+
|
|
72
|
+
```
|
|
73
|
+
CHECKPOINT:
|
|
74
|
+
type: human-action
|
|
75
|
+
blocking: "No running application detected"
|
|
76
|
+
details: "Checked ports: 3000, 3001, 4200, 5173, 5174, 8080, 8000, 8888. No HTTP response."
|
|
77
|
+
awaiting: "Provide the application URL, or start your dev server and retry."
|
|
78
|
+
```
|
|
79
|
+
</step>
|
|
80
|
+
|
|
81
|
+
<step name="catalog_e2e_files">
|
|
82
|
+
## Step 2: Catalog E2E Test Files
|
|
83
|
+
|
|
84
|
+
Identify all E2E test files and their corresponding POM files to run.
|
|
85
|
+
|
|
86
|
+
```bash
|
|
87
|
+
# Find E2E test specs
|
|
88
|
+
find . -name '*.e2e.spec.*' -o -name '*.e2e.cy.*' -o -name '*.e2e.test.*' | sort
|
|
89
|
+
|
|
90
|
+
# Find POM files
|
|
91
|
+
find . -path '*/pages/*' -o -path '*/page-objects/*' | grep -E '\.(ts|js|py)$' | sort
|
|
92
|
+
```
|
|
93
|
+
|
|
94
|
+
Build a test manifest:
|
|
95
|
+
```
|
|
96
|
+
E2E_FILES:
|
|
97
|
+
- path: "tests/e2e/smoke/login.e2e.spec.ts"
|
|
98
|
+
pages_involved: ["LoginPage"]
|
|
99
|
+
routes: ["/login", "/dashboard"]
|
|
100
|
+
- path: "tests/e2e/smoke/checkout.e2e.spec.ts"
|
|
101
|
+
pages_involved: ["CheckoutPage", "CartPage"]
|
|
102
|
+
routes: ["/cart", "/checkout", "/checkout/confirm"]
|
|
103
|
+
```
|
|
104
|
+
|
|
105
|
+
Extract routes from test files by reading `page.goto()`, `navigate()`, or route-related calls.
|
|
106
|
+
</step>
|
|
107
|
+
|
|
108
|
+
<step name="inspect_pages">
|
|
109
|
+
## Step 3: Inspect Live Pages and Capture Real Locators
|
|
110
|
+
|
|
111
|
+
For each route in the test manifest, navigate to the page and capture its real structure.
|
|
112
|
+
|
|
113
|
+
**For each route:**
|
|
114
|
+
|
|
115
|
+
1. **Navigate:**
|
|
116
|
+
```
|
|
117
|
+
browser_navigate(url: "{app_url}{route}")
|
|
118
|
+
```
|
|
119
|
+
|
|
120
|
+
2. **Wait for page to load:**
|
|
121
|
+
```
|
|
122
|
+
browser_wait_for(time: 2)
|
|
123
|
+
```
|
|
124
|
+
|
|
125
|
+
3. **Capture accessibility snapshot:**
|
|
126
|
+
```
|
|
127
|
+
browser_snapshot()
|
|
128
|
+
```
|
|
129
|
+
This returns the accessibility tree with all elements, their roles, names, and refs. This is the source of truth for what locators actually exist on the page.
|
|
130
|
+
|
|
131
|
+
4. **Extract existing data-testid values:**
|
|
132
|
+
```
|
|
133
|
+
browser_evaluate(function: "() => {
|
|
134
|
+
const elements = document.querySelectorAll('[data-testid]');
|
|
135
|
+
return Array.from(elements).map(el => ({
|
|
136
|
+
testid: el.getAttribute('data-testid'),
|
|
137
|
+
tag: el.tagName.toLowerCase(),
|
|
138
|
+
role: el.getAttribute('role') || '',
|
|
139
|
+
text: el.textContent?.trim().substring(0, 50) || '',
|
|
140
|
+
visible: el.offsetParent !== null
|
|
141
|
+
}));
|
|
142
|
+
}")
|
|
143
|
+
```
|
|
144
|
+
|
|
145
|
+
5. **Extract interactive elements:**
|
|
146
|
+
```
|
|
147
|
+
browser_evaluate(function: "() => {
|
|
148
|
+
const selectors = 'button, input, select, textarea, a[href], [role=\"button\"], [role=\"link\"], [role=\"tab\"], [role=\"checkbox\"], [role=\"radio\"]';
|
|
149
|
+
const elements = document.querySelectorAll(selectors);
|
|
150
|
+
return Array.from(elements).map(el => ({
|
|
151
|
+
tag: el.tagName.toLowerCase(),
|
|
152
|
+
type: el.getAttribute('type') || '',
|
|
153
|
+
testid: el.getAttribute('data-testid') || '',
|
|
154
|
+
role: el.getAttribute('role') || '',
|
|
155
|
+
name: el.getAttribute('name') || '',
|
|
156
|
+
ariaLabel: el.getAttribute('aria-label') || '',
|
|
157
|
+
placeholder: el.getAttribute('placeholder') || '',
|
|
158
|
+
text: el.textContent?.trim().substring(0, 50) || '',
|
|
159
|
+
id: el.id || '',
|
|
160
|
+
visible: el.offsetParent !== null
|
|
161
|
+
}));
|
|
162
|
+
}")
|
|
163
|
+
```
|
|
164
|
+
|
|
165
|
+
6. **Take screenshot for reference:**
|
|
166
|
+
```
|
|
167
|
+
browser_take_screenshot(type: "png", filename: ".qa-output/screenshots/{route-slug}.png")
|
|
168
|
+
```
|
|
169
|
+
|
|
170
|
+
**Build a real locator map per route:**
|
|
171
|
+
|
|
172
|
+
```
|
|
173
|
+
ROUTE: /login
|
|
174
|
+
REAL_LOCATORS:
|
|
175
|
+
- element: "email input"
|
|
176
|
+
best_locator: "getByTestId('login-email-input')" # Tier 1 - data-testid exists
|
|
177
|
+
fallback: "getByLabel('Email')" # Tier 2
|
|
178
|
+
role: "textbox"
|
|
179
|
+
name: "Email"
|
|
180
|
+
- element: "password input"
|
|
181
|
+
best_locator: "getByTestId('login-password-input')"
|
|
182
|
+
fallback: "getByLabel('Password')"
|
|
183
|
+
role: "textbox"
|
|
184
|
+
name: "Password"
|
|
185
|
+
- element: "submit button"
|
|
186
|
+
best_locator: "getByRole('button', { name: 'Log in' })" # Tier 1 - role + name
|
|
187
|
+
fallback: "getByText('Log in')" # Tier 2
|
|
188
|
+
role: "button"
|
|
189
|
+
name: "Log in"
|
|
190
|
+
```
|
|
191
|
+
|
|
192
|
+
**Locator selection priority (from accessibility snapshot and evaluate results):**
|
|
193
|
+
1. `data-testid` exists → use `getByTestId()`
|
|
194
|
+
2. Role + accessible name is unique → use `getByRole()`
|
|
195
|
+
3. Label exists → use `getByLabel()`
|
|
196
|
+
4. Placeholder exists → use `getByPlaceholder()`
|
|
197
|
+
5. Text content is unique and stable → use `getByText()`
|
|
198
|
+
6. None of the above → use CSS selector with `// TODO: Request test ID` comment
|
|
199
|
+
</step>
|
|
200
|
+
|
|
201
|
+
<step name="compare_and_fix_locators">
|
|
202
|
+
## Step 4: Compare Generated Locators vs Real Locators
|
|
203
|
+
|
|
204
|
+
For each E2E test file and its POM:
|
|
205
|
+
|
|
206
|
+
1. **Read the generated file** and extract all locators used
|
|
207
|
+
2. **Compare against real locator map** from Step 3
|
|
208
|
+
3. **Identify mismatches:**
|
|
209
|
+
- Locator references an element that doesn't exist on the page
|
|
210
|
+
- Locator uses Tier 4 (CSS) when Tier 1 (testid/role) is available
|
|
211
|
+
- Locator text doesn't match actual text on page
|
|
212
|
+
- data-testid value in test doesn't match actual data-testid on page
|
|
213
|
+
|
|
214
|
+
4. **Fix each mismatch:**
|
|
215
|
+
- Replace incorrect locators with real ones from the locator map
|
|
216
|
+
- Upgrade locator tier where possible (CSS → testid or role)
|
|
217
|
+
- Update text assertions with actual text from the page
|
|
218
|
+
- Add `// TODO: Request test ID` for elements that have no testid and no good role/label
|
|
219
|
+
|
|
220
|
+
5. **Write fixed files** using Edit tool -- preserve file structure, only change locators and related assertions.
|
|
221
|
+
|
|
222
|
+
**Log all changes:**
|
|
223
|
+
```
|
|
224
|
+
LOCATOR_FIXES:
|
|
225
|
+
- file: "pages/LoginPage.ts"
|
|
226
|
+
line: 12
|
|
227
|
+
was: "page.locator('.btn-primary')"
|
|
228
|
+
now: "page.getByRole('button', { name: 'Log in' })"
|
|
229
|
+
reason: "Upgraded from Tier 4 (CSS) to Tier 1 (role)"
|
|
230
|
+
- file: "tests/e2e/smoke/login.e2e.spec.ts"
|
|
231
|
+
line: 24
|
|
232
|
+
was: "expect(page.locator('.welcome-msg')).toHaveText('Welcome')"
|
|
233
|
+
now: "expect(page.getByTestId('dashboard-welcome-alert')).toHaveText('Welcome back, Test User')"
|
|
234
|
+
reason: "Fixed locator (CSS→testid) and assertion (vague→concrete from real page)"
|
|
235
|
+
```
|
|
236
|
+
</step>
|
|
237
|
+
|
|
238
|
+
<step name="run_tests">
|
|
239
|
+
## Step 5: Execute Tests
|
|
240
|
+
|
|
241
|
+
Run the E2E tests using the project's test runner.
|
|
242
|
+
|
|
243
|
+
**Detect test runner:**
|
|
244
|
+
```bash
|
|
245
|
+
# Check for Playwright
|
|
246
|
+
[ -f "playwright.config.ts" ] || [ -f "playwright.config.js" ] && RUNNER="playwright"
|
|
247
|
+
|
|
248
|
+
# Check for Cypress
|
|
249
|
+
[ -f "cypress.config.ts" ] || [ -f "cypress.config.js" ] && RUNNER="cypress"
|
|
250
|
+
|
|
251
|
+
# Check package.json scripts
|
|
252
|
+
grep -q "playwright" package.json && RUNNER="playwright"
|
|
253
|
+
grep -q "cypress" package.json && RUNNER="cypress"
|
|
254
|
+
```
|
|
255
|
+
|
|
256
|
+
**Run tests:**
|
|
257
|
+
|
|
258
|
+
For Playwright:
|
|
259
|
+
```bash
|
|
260
|
+
npx playwright test {test_file_paths} --reporter=json 2>&1
|
|
261
|
+
```
|
|
262
|
+
|
|
263
|
+
For Cypress:
|
|
264
|
+
```bash
|
|
265
|
+
npx cypress run --spec "{test_file_paths}" --reporter json 2>&1
|
|
266
|
+
```
|
|
267
|
+
|
|
268
|
+
**Parse results:**
|
|
269
|
+
- Total tests, passed, failed, skipped
|
|
270
|
+
- For each failure: test name, error message, file path, line number
|
|
271
|
+
</step>
|
|
272
|
+
|
|
273
|
+
<step name="fix_loop">
|
|
274
|
+
## Step 6: Diagnose Failures and Fix (Loop max 3 times)
|
|
275
|
+
|
|
276
|
+
For each failing test:
|
|
277
|
+
|
|
278
|
+
1. **Read the error message** -- what assertion failed, what element wasn't found, what timeout hit
|
|
279
|
+
|
|
280
|
+
2. **Navigate to the failing page with browser tools:**
|
|
281
|
+
```
|
|
282
|
+
browser_navigate(url: "{app_url}{failing_route}")
|
|
283
|
+
browser_snapshot()
|
|
284
|
+
```
|
|
285
|
+
|
|
286
|
+
3. **Diagnose the failure type:**
|
|
287
|
+
|
|
288
|
+
| Error Pattern | Diagnosis | Action |
|
|
289
|
+
|---------------|-----------|--------|
|
|
290
|
+
| "Element not found" / "Timeout waiting for selector" | Locator mismatch | Capture snapshot, find real element, fix locator |
|
|
291
|
+
| "Expected X but received Y" | Assertion value wrong | Read real value from page, update assertion |
|
|
292
|
+
| "Navigation timeout" | Page doesn't load / wrong URL | Check if route is correct, check for redirects |
|
|
293
|
+
| "Element not visible" | Element exists but hidden | Check page state, may need to scroll or wait |
|
|
294
|
+
| "API returned 401/403" | Auth issue | Test needs auth setup -- flag as test code error |
|
|
295
|
+
| "Element is not interactable" | Overlay, modal, or loading state | Add wait_for before interaction |
|
|
296
|
+
| "net::ERR_CONNECTION_REFUSED" | App not running | Flag as environment issue |
|
|
297
|
+
|
|
298
|
+
4. **For locator/assertion issues -- fix and continue:**
|
|
299
|
+
- Use `browser_snapshot()` to get the real accessibility tree
|
|
300
|
+
- Use `browser_evaluate()` to inspect specific elements
|
|
301
|
+
- Use `browser_take_screenshot()` to visually confirm state
|
|
302
|
+
- Edit the test/POM file with the correct locator or assertion value
|
|
303
|
+
|
|
304
|
+
5. **For application bugs -- classify and stop fixing that test:**
|
|
305
|
+
- The page actually behaves incorrectly (button does nothing, form submits but errors, data not displayed)
|
|
306
|
+
- Document: what was expected, what actually happened, screenshot as evidence
|
|
307
|
+
- Do NOT fix the test to pass -- the test is correct, the app is wrong
|
|
308
|
+
|
|
309
|
+
6. **Re-run after fixes:**
|
|
310
|
+
```bash
|
|
311
|
+
npx playwright test {fixed_files} --reporter=json 2>&1
|
|
312
|
+
```
|
|
313
|
+
|
|
314
|
+
7. **Repeat up to 3 times.** After 3 loops, classify remaining failures and stop.
|
|
315
|
+
</step>
|
|
316
|
+
|
|
317
|
+
<step name="produce_report">
|
|
318
|
+
## Step 7: Produce E2E Run Report
|
|
319
|
+
|
|
320
|
+
Write `{output_dir}/E2E_RUN_REPORT.md`:
|
|
321
|
+
|
|
322
|
+
```markdown
|
|
323
|
+
# E2E Test Execution Report
|
|
324
|
+
|
|
325
|
+
## Summary
|
|
326
|
+
|
|
327
|
+
| Metric | Value |
|
|
328
|
+
|--------|-------|
|
|
329
|
+
| App URL | {app_url} |
|
|
330
|
+
| Test files | {file_count} |
|
|
331
|
+
| Total tests | {total} |
|
|
332
|
+
| Passed | {passed} |
|
|
333
|
+
| Failed | {failed} |
|
|
334
|
+
| Fix loops used | {loop_count}/3 |
|
|
335
|
+
|
|
336
|
+
## Locator Fixes Applied
|
|
337
|
+
|
|
338
|
+
| File | Line | Was | Now | Reason |
|
|
339
|
+
|------|------|-----|-----|--------|
|
|
340
|
+
| ... | ... | ... | ... | ... |
|
|
341
|
+
|
|
342
|
+
## Test Results
|
|
343
|
+
|
|
344
|
+
### Passed
|
|
345
|
+
- [test name] -- {file}:{line}
|
|
346
|
+
- ...
|
|
347
|
+
|
|
348
|
+
### Failed (Application Bugs)
|
|
349
|
+
- [test name] -- {file}:{line}
|
|
350
|
+
- **Expected:** {expected}
|
|
351
|
+
- **Actual:** {actual}
|
|
352
|
+
- **Evidence:** screenshot at {path}
|
|
353
|
+
- **Classification:** APPLICATION BUG
|
|
354
|
+
|
|
355
|
+
### Failed (Unresolved after 3 fix loops)
|
|
356
|
+
- [test name] -- {file}:{line}
|
|
357
|
+
- **Error:** {error}
|
|
358
|
+
- **Attempts:** 3
|
|
359
|
+
- **Classification:** {TEST CODE ERROR | ENVIRONMENT ISSUE | INCONCLUSIVE}
|
|
360
|
+
|
|
361
|
+
## Screenshots
|
|
362
|
+
- {route}: {screenshot_path}
|
|
363
|
+
- ...
|
|
364
|
+
```
|
|
365
|
+
</step>
|
|
366
|
+
|
|
367
|
+
<step name="cleanup">
|
|
368
|
+
## Step 8: Cleanup
|
|
369
|
+
|
|
370
|
+
```
|
|
371
|
+
browser_close()
|
|
372
|
+
```
|
|
373
|
+
|
|
374
|
+
**Return structured result to orchestrator:**
|
|
375
|
+
|
|
376
|
+
```
|
|
377
|
+
E2E_RUNNER_COMPLETE:
|
|
378
|
+
app_url: "{app_url}"
|
|
379
|
+
total_tests: N
|
|
380
|
+
passed: N
|
|
381
|
+
failed: N
|
|
382
|
+
locator_fixes: N
|
|
383
|
+
app_bugs_found: N
|
|
384
|
+
fix_loops_used: N
|
|
385
|
+
report_path: "{output_dir}/E2E_RUN_REPORT.md"
|
|
386
|
+
screenshots: ["{path1}", "{path2}", ...]
|
|
387
|
+
```
|
|
388
|
+
</step>
|
|
389
|
+
|
|
390
|
+
</process>
|
|
391
|
+
|
|
392
|
+
<error_handling>
|
|
393
|
+
| Error | Cause | Action |
|
|
394
|
+
|-------|-------|--------|
|
|
395
|
+
| No app URL and no dev server detected | App not running | Checkpoint: ask user for URL or to start server |
|
|
396
|
+
| Browser not installed | Playwright browsers missing | Run `browser_install()` then retry |
|
|
397
|
+
| All tests timeout | App URL wrong or app crashed | Check URL, take screenshot, report as ENVIRONMENT ISSUE |
|
|
398
|
+
| Auth-gated pages | Tests need login first | Check if test has auth setup, suggest adding login fixture |
|
|
399
|
+
| Dynamic content changes between runs | Flaky locators | Prefer data-testid over text-based locators, add waits |
|
|
400
|
+
| Test runner not found | No playwright/cypress installed | Report as ENVIRONMENT ISSUE with install instructions |
|
|
401
|
+
</error_handling>
|
|
402
|
+
|
|
403
|
+
<success_criteria>
|
|
404
|
+
E2E runner is complete when:
|
|
405
|
+
|
|
406
|
+
- [ ] All pages in the test manifest were inspected with browser_snapshot
|
|
407
|
+
- [ ] Real locator map was built for every route
|
|
408
|
+
- [ ] Generated locators were compared and fixed where mismatched
|
|
409
|
+
- [ ] Tests were executed against the live app
|
|
410
|
+
- [ ] Failures were diagnosed using browser tools (snapshot, screenshot, evaluate)
|
|
411
|
+
- [ ] Fixable issues (locators, assertions) were auto-fixed (up to 3 loops)
|
|
412
|
+
- [ ] Application bugs were classified with evidence (not auto-fixed)
|
|
413
|
+
- [ ] E2E_RUN_REPORT.md was written with full results
|
|
414
|
+
- [ ] Browser session was closed
|
|
415
|
+
</success_criteria>
|
package/agents/qaa-executor.md
CHANGED
|
@@ -32,6 +32,14 @@ Read ALL of the following files BEFORE producing any output. The executor's code
|
|
|
32
32
|
- Locator priority (data-testid first, ARIA roles, labels, CSS last resort)
|
|
33
33
|
- Expected outcome rules (specific, measurable, negative cases, state transitions)
|
|
34
34
|
|
|
35
|
+
- **~/.claude/qaa/MY_PREFERENCES.md** (optional -- read if exists). User's personal QA preferences saved by the qa-learner skill. If a preference conflicts with CLAUDE.md, the preference wins (it is a user override). Check for rules about: framework choices, locator strategy, assertion style, naming conventions, language preferences.
|
|
36
|
+
|
|
37
|
+
- **Codebase map documents** (optional -- read if they exist in `{codebase_map_dir}/` or `.qa-output/codebase/`):
|
|
38
|
+
- **CODE_PATTERNS.md** -- Naming conventions, import patterns, code style used in the project. Use to generate tests that feel native to the codebase (matching variable naming, import style, file organization).
|
|
39
|
+
- **API_CONTRACTS.md** -- Exact request/response shapes, auth patterns, error response formats. Use for API test assertions with real payload shapes and correct auth headers.
|
|
40
|
+
- **TEST_SURFACE.md** -- Function signatures, parameter types, return types. Use to write accurate test code with correct imports, mock setup, and assertion targets.
|
|
41
|
+
If these files exist, they enable the executor to generate higher-quality tests that match the project's actual code patterns and API shapes.
|
|
42
|
+
|
|
35
43
|
Note: The executor MUST read CLAUDE.md POM rules and locator tiers before writing any page object or test file. These rules are non-negotiable and must be applied to every generated file.
|
|
36
44
|
</required_reading>
|
|
37
45
|
|
|
@@ -75,6 +83,12 @@ Read all input artifacts and build the execution context.
|
|
|
75
83
|
- Extract POM generation rules
|
|
76
84
|
- Extract expected outcome rules
|
|
77
85
|
- These patterns guide the code generation in step 4
|
|
86
|
+
|
|
87
|
+
6. **Read codebase map documents** (if they exist -- check `{codebase_map_dir}/` or `.qa-output/codebase/`):
|
|
88
|
+
- **CODE_PATTERNS.md** -- Extract naming conventions (variable casing, import style, file organization). Match generated test code to the project's native style.
|
|
89
|
+
- **API_CONTRACTS.md** -- Extract exact request/response shapes with field types, auth header patterns, error response formats. Use for concrete API test payloads and response assertions.
|
|
90
|
+
- **TEST_SURFACE.md** -- Extract function signatures with parameter types and return types. Use to write accurate import statements, mock setup, and assertion values.
|
|
91
|
+
If any of these files do not exist, proceed without them -- generate tests from TEST_INVENTORY.md specifications alone.
|
|
78
92
|
</step>
|
|
79
93
|
|
|
80
94
|
<step name="detect_existing_infrastructure">
|
package/agents/qaa-planner.md
CHANGED
|
@@ -20,6 +20,15 @@ Read ALL of the following files BEFORE producing any output. Do NOT skip any fil
|
|
|
20
20
|
|
|
21
21
|
- **templates/qa-repo-blueprint.md** -- Optional reference for folder structure. If the orchestrator indicates that QA_REPO_BLUEPRINT.md was produced by the analyzer, read it for the exact folder structure to use when assigning file paths. If no blueprint exists, use the CLAUDE.md Repo Structure defaults.
|
|
22
22
|
|
|
23
|
+
- **~/.claude/qaa/MY_PREFERENCES.md** (optional -- read if exists). User's personal QA preferences saved by the qa-learner skill. If a preference conflicts with CLAUDE.md, the preference wins (it is a user override). Check for rules about: framework choices, naming conventions, file structure, workflow preferences.
|
|
24
|
+
|
|
25
|
+
- **Codebase map documents** (optional -- read if they exist in `{codebase_map_dir}/` or `.qa-output/codebase/`):
|
|
26
|
+
- **TESTABILITY.md** -- Pure functions vs stateful code, mock boundaries. Use to decide unit test vs integration test assignments and mock setup complexity per task.
|
|
27
|
+
- **TEST_SURFACE.md** -- Exhaustive list of testable entry points with signatures. Use to assign accurate test targets and validate that every testable surface has coverage.
|
|
28
|
+
- **CRITICAL_PATHS.md** -- User flows for E2E smoke tests. Use to group E2E test cases by critical business flow.
|
|
29
|
+
- **COVERAGE_GAPS.md** -- Uncovered modules and functions. Use to prioritize task ordering (cover gaps first).
|
|
30
|
+
If these files exist, they provide deep codebase knowledge that improves task grouping, dependency analysis, and complexity estimation.
|
|
31
|
+
|
|
23
32
|
Note: Read these files in full. The planner's output quality depends entirely on how thoroughly it reads and cross-references the input artifacts. Every test case ID in TEST_INVENTORY.md MUST appear in exactly one task in the generation plan.
|
|
24
33
|
</required_reading>
|
|
25
34
|
|
|
@@ -52,7 +61,14 @@ Read TEST_INVENTORY.md and QA_ANALYSIS.md completely. These are the two primary
|
|
|
52
61
|
- Extract the Recommended Stack for framework and file extensions
|
|
53
62
|
- If no blueprint exists, use CLAUDE.md Repo Structure defaults
|
|
54
63
|
|
|
55
|
-
5. **
|
|
64
|
+
5. **Read codebase map documents** (if they exist -- check `{codebase_map_dir}/` or `.qa-output/codebase/`):
|
|
65
|
+
- **TESTABILITY.md** -- Extract pure functions (cheap unit tests) vs stateful code (integration setup needed). Use for mock complexity estimation per task.
|
|
66
|
+
- **TEST_SURFACE.md** -- Extract testable entry points with function signatures, parameter types, return types. Cross-reference with TEST_INVENTORY.md targets to validate coverage completeness.
|
|
67
|
+
- **CRITICAL_PATHS.md** -- Extract critical user flows. Use to group E2E test cases into logical flow-based tasks.
|
|
68
|
+
- **COVERAGE_GAPS.md** -- Extract uncovered modules. Prioritize tasks that fill critical gaps first in the execution order.
|
|
69
|
+
If any of these files do not exist, proceed without them.
|
|
70
|
+
|
|
71
|
+
6. **Determine file extension** from the detected framework:
|
|
56
72
|
- TypeScript + Playwright: `.spec.ts` for tests, `.ts` for POMs
|
|
57
73
|
- TypeScript + Cypress: `.cy.ts` for E2E, `.spec.ts` for unit/API, `.ts` for POMs
|
|
58
74
|
- TypeScript + Jest/Vitest: `.test.ts` for unit, `.spec.ts` for API/E2E, `.ts` for POMs
|
package/agents/qaa-scanner.md
CHANGED
|
@@ -14,6 +14,8 @@ Read these files BEFORE any scanning operation. Do NOT skip.
|
|
|
14
14
|
- **Read-Before-Write Rules** -- Scanner MUST read package.json (or equivalent), folder tree structure, all source file extensions before producing output
|
|
15
15
|
- **data-testid Convention** -- Understand naming convention so has_frontend flag can inform testid-injector downstream
|
|
16
16
|
|
|
17
|
+
- **~/.claude/qaa/MY_PREFERENCES.md** (optional -- read if exists). User's personal QA preferences saved by the qa-learner skill. If a preference conflicts with CLAUDE.md, the preference wins (it is a user override). Check for rules about: framework choices, language preferences.
|
|
18
|
+
|
|
17
19
|
Note: Read these files in full. Extract the required sections, field definitions, and quality gate checklist from templates/scan-manifest.md. These define your output contract.
|
|
18
20
|
</required_reading>
|
|
19
21
|
|
|
@@ -42,6 +42,8 @@ Read ALL of the following files BEFORE any scanning, auditing, or injection oper
|
|
|
42
42
|
- Third-party component handling priority order
|
|
43
43
|
- Quality gate items (6 items)
|
|
44
44
|
|
|
45
|
+
- **~/.claude/qaa/MY_PREFERENCES.md** (optional -- read if exists). User's personal QA preferences saved by the qa-learner skill. If a preference conflicts with CLAUDE.md, the preference wins (it is a user override). Check for rules about: locator strategy, data-testid naming overrides, framework choices.
|
|
46
|
+
|
|
45
47
|
Note: Read ALL files in full. Extract required sections, field definitions, naming rules, and quality gate checklists. These define your behavioral contract.
|
|
46
48
|
</required_reading>
|
|
47
49
|
|
package/agents/qaa-validator.md
CHANGED
|
@@ -20,6 +20,8 @@ Read ALL of the following files BEFORE performing any validation. Do NOT skip.
|
|
|
20
20
|
|
|
21
21
|
- **.claude/skills/qa-self-validator/SKILL.md** -- Defines the 4 validation layers (Syntax, Structure, Dependencies, Logic), pass criteria per layer, fix loop protocol (max 3 loops), and output format.
|
|
22
22
|
|
|
23
|
+
- **~/.claude/qaa/MY_PREFERENCES.md** (optional -- read if exists). User's personal QA preferences saved by the qa-learner skill. If a preference conflicts with CLAUDE.md, the preference wins (it is a user override). Check for rules about: assertion style, locator strategy, naming conventions, framework choices.
|
|
24
|
+
|
|
23
25
|
Note: Read these files in full. Extract the layer definitions, pass criteria, confidence calculation rules, and quality gate checklist. These define your validation contract and output requirements.
|
|
24
26
|
|
|
25
27
|
**Important:** The generation plan is the source of truth for which files to validate. If a file exists in the test directory but is NOT in the generation plan, it is a pre-existing file and MUST be excluded from validation scope. The only exception is Layer 4's cross-check for duplicate IDs, which reads (but does not validate or modify) existing test files.
|
package/bin/install.cjs
CHANGED
|
@@ -52,6 +52,13 @@ function copyFile(src, dest) {
|
|
|
52
52
|
return true;
|
|
53
53
|
}
|
|
54
54
|
|
|
55
|
+
/**
 * Count the immediate entries of a directory, by kind.
 *
 * @param {string} dir - Directory path to inspect.
 * @param {string} type - 'dirs' to count subdirectories; any other value counts regular files.
 * @returns {number} Number of matching entries, or 0 if `dir` does not exist.
 */
function countEntries(dir, type) {
  if (!fs.existsSync(dir)) return 0;
  const wantDirs = type === 'dirs';
  let total = 0;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (wantDirs ? entry.isDirectory() : entry.isFile()) total += 1;
  }
  return total;
}
|
|
61
|
+
|
|
55
62
|
// Print a success line: a green check mark followed by the message.
function ok(msg) {
  const check = '\x1b[32m✓\x1b[0m';
  console.log(` ${check} ${msg}`);
}
|
|
56
63
|
// Print an indented informational line (aligned with the text of ok() output).
function info(msg) { console.log(`   ${msg}`); }
|
|
57
64
|
|
|
@@ -100,11 +107,12 @@ async function main() {
|
|
|
100
107
|
const cmdCount = copyDir(commandsSrc, commandsDest);
|
|
101
108
|
ok(`Installed ${cmdCount} slash commands`);
|
|
102
109
|
|
|
103
|
-
// Install skills
|
|
110
|
+
// Install skills (only to baseDir -- Claude Code reads from ~/.claude/skills/)
|
|
104
111
|
const skillsSrc = path.join(ROOT, '.claude', 'skills');
|
|
105
112
|
const skillsDest = path.join(baseDir, 'skills');
|
|
106
113
|
const skillCount = copyDir(skillsSrc, skillsDest);
|
|
107
|
-
|
|
114
|
+
const skillDirCount = countEntries(skillsSrc, 'dirs');
|
|
115
|
+
ok(`Installed ${skillDirCount} skills (${skillCount} files)`);
|
|
108
116
|
|
|
109
117
|
// Install workflows
|
|
110
118
|
const workflowsSrc = path.join(ROOT, 'workflows');
|
|
@@ -160,7 +168,7 @@ async function main() {
|
|
|
160
168
|
}
|
|
161
169
|
|
|
162
170
|
// Done
|
|
163
|
-
const total = cmdCount + skillCount + agentCount + templateCount + binCount;
|
|
171
|
+
const total = cmdCount + skillCount + agentCount + templateCount + wfCount + binCount;
|
|
164
172
|
console.log('');
|
|
165
173
|
console.log(` \x1b[32m✓ Done!\x1b[0m Installed ${total} files.`);
|
|
166
174
|
console.log('');
|
|
@@ -172,7 +180,7 @@ async function main() {
|
|
|
172
180
|
console.log(' \x1b[1m/qa-from-ticket\x1b[0m Tests from a Jira/Linear ticket');
|
|
173
181
|
console.log(' \x1b[1m/qa-validate\x1b[0m Validate existing tests');
|
|
174
182
|
console.log('');
|
|
175
|
-
console.log(
|
|
183
|
+
console.log(` ${cmdCount} commands + ${skillDirCount} skills + ${agentCount} agents ready.`);
|
|
176
184
|
console.log('');
|
|
177
185
|
}
|
|
178
186
|
|