npm - qaa-agent - Versions diffs - 1.3.0 → 1.5.0 - Mend

qaa-agent 1.3.0 → 1.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (23)

package/.claude/commands/create-test.md +42 -4
package/.claude/commands/qa-analyze.md +8 -10
package/.claude/commands/qa-map.md +18 -7
package/.claude/commands/qa-pr.md +23 -0
package/.claude/commands/qa-validate.md +25 -3
package/.claude/skills/qa-learner/SKILL.md +8 -0
package/CLAUDE.md +23 -13
package/README.md +20 -7
package/agents/qa-pipeline-orchestrator.md +171 -10
package/agents/qaa-analyzer.md +16 -0
package/agents/qaa-bug-detective.md +2 -0
package/agents/qaa-e2e-runner.md +415 -0
package/agents/qaa-executor.md +14 -0
package/agents/qaa-planner.md +17 -1
package/agents/qaa-scanner.md +2 -0
package/agents/qaa-testid-injector.md +2 -0
package/agents/qaa-validator.md +2 -0
package/bin/install.cjs +12 -4
package/docs/COMMANDS.md +341 -0
package/docs/DEMO.md +182 -0
package/docs/TESTING.md +156 -0
package/package.json +2 -1
package/workflows/qa-pr.md +389 -0

package/agents/qa-pipeline-orchestrator.md CHANGED

@@ -5,7 +5,7 @@ Invoked by the `/qa-start` slash command (Phase 6) or directly via Task() with t
 **Pipeline stages in order:**
 ```
-scan -> analyze -> [testid-inject if frontend] -> plan -> generate -> validate -> [bug-detective if failures] -> deliver
+scan -> codebase-map -> analyze -> [testid-inject if frontend] -> plan -> generate -> validate -> [e2e-runner if E2E tests] -> [bug-detective if failures] -> deliver
 ```
 **Workflow options:**
@@ -109,31 +109,34 @@ Based on `option` value from init, select the stage sequence. Each option shares
 **Option 1 stages:**
 ```
-scan(dev) -> analyze(full) -> [testid-inject if frontend] -> plan -> generate -> validate -> [bug-detective if failures] -> deliver
+scan(dev) -> codebase-map -> analyze(full) -> [testid-inject if frontend] -> plan -> generate -> validate -> [bug-detective if failures] -> deliver
 ```
 - Scanner: scan DEV repo only
+- Codebase Map: deep-scan codebase for testability, risk, patterns, existing tests
 - Analyzer: mode='full' (produces QA_ANALYSIS.md + TEST_INVENTORY.md + QA_REPO_BLUEPRINT.md)
-- Planner: reads TEST_INVENTORY.md + QA_ANALYSIS.md
-- Executor: generates all planned test files
+- Planner: reads TEST_INVENTORY.md + QA_ANALYSIS.md + codebase map documents
+- Executor: generates all planned test files using codebase map for context
 - All stages run against DEV repo artifacts
 **Option 2 stages:**
 ```
-scan(both) -> analyze(gap) -> [testid-inject if frontend] -> plan(gap) -> generate(gap) -> validate -> [bug-detective if failures] -> deliver
+scan(both) -> codebase-map -> analyze(gap) -> [testid-inject if frontend] -> plan(gap) -> generate(gap) -> validate -> [bug-detective if failures] -> deliver
 ```
 - Scanner: scan BOTH dev_repo_path and qa_repo_path
+- Codebase Map: deep-scan dev codebase for testability, risk, patterns, existing tests
 - Analyzer: mode='gap' (produces GAP_ANALYSIS.md)
-- Planner: reads GAP_ANALYSIS.md (fix broken first, then add missing, then standardize)
+- Planner: reads GAP_ANALYSIS.md + codebase map documents (fix broken first, then add missing, then standardize)
 - Executor: generates fixed test files + new test files + standardized files
 - All stages aware of existing QA repo structure
 **Option 3 stages:**
 ```
-scan(both) -> analyze(gap) -> [testid-inject if frontend] -> plan(gap) -> generate(skip-existing) -> validate -> [bug-detective if failures] -> deliver
+scan(both) -> codebase-map -> analyze(gap) -> [testid-inject if frontend] -> plan(gap) -> generate(skip-existing) -> validate -> [bug-detective if failures] -> deliver
 ```
 - Scanner: scan BOTH dev_repo_path and qa_repo_path
+- Codebase Map: deep-scan dev codebase for testability, risk, patterns, existing tests
 - Analyzer: mode='gap' (produces GAP_ANALYSIS.md with thin areas only)
-- Planner: reads GAP_ANALYSIS.md (missing tests only)
+- Planner: reads GAP_ANALYSIS.md + codebase map documents (missing tests only)
 - Executor: passes `skip_existing_test_ids: true` so it checks existing test files by test ID before generating -- skips tests that already exist
 - Only new test files are generated; existing working tests are left untouched
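The three option-specific stage sequences in the hunk above can be read as a small lookup table. The following Python sketch is only an illustration of that reading: the orchestrator itself is a Markdown prompt, and the names `STAGES_BY_OPTION`, `stages_for`, and `render` are hypothetical. The stage labels are copied from the `+` lines as written, so, like those lines, the per-option lists do not include the `e2e-runner` stage that appears in the top-level pipeline summary.

```python
from typing import Dict, List

# Hypothetical lookup table; stage labels copied from the "+" lines of the hunk.
COMMON_TAIL: List[str] = ["validate", "[bug-detective if failures]", "deliver"]

STAGES_BY_OPTION: Dict[int, List[str]] = {
    1: ["scan(dev)", "codebase-map", "analyze(full)",
        "[testid-inject if frontend]", "plan", "generate"] + COMMON_TAIL,
    2: ["scan(both)", "codebase-map", "analyze(gap)",
        "[testid-inject if frontend]", "plan(gap)", "generate(gap)"] + COMMON_TAIL,
    3: ["scan(both)", "codebase-map", "analyze(gap)",
        "[testid-inject if frontend]", "plan(gap)", "generate(skip-existing)"] + COMMON_TAIL,
}


def stages_for(option: int) -> List[str]:
    """Return a copy of the stage sequence for a workflow option (1, 2, or 3)."""
    if option not in STAGES_BY_OPTION:
        raise ValueError(f"unknown workflow option: {option}")
    return list(STAGES_BY_OPTION[option])


def render(option: int) -> str:
    """Render the sequence in the arrow notation used by the diff."""
    return " -> ".join(stages_for(option))


if __name__ == "__main__":
    for opt in sorted(STAGES_BY_OPTION):
        print(f"Option {opt}: {render(opt)}")
```

In this reading, the only change 1.5.0 makes to each option is the `codebase-map` entry inserted directly after the scan stage; everything from `validate` onward is shared.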
@@ -231,8 +234,79 @@ If `detection_confidence` is `LOW`:
 - If `IS_AUTO` is false: Present the detection details to user. Wait for confirmation. On user response, spawn fresh continuation agent with user's framework choice.
 </step>
+<step name="execute_codebase_map">
+## Step 4: Execute Codebase Map Stage
+Deep-scan the codebase to produce QA-oriented documents that all downstream agents consume. This gives the pipeline full knowledge of the codebase structure, testability, risk areas, code patterns, and existing test coverage before any analysis or generation happens.
+**State update -- mark codebase-map as running:**
+```bash
+node bin/qaa-tools.cjs state patch --"Map Status" running --"Status" "Mapping codebase"
+```
+**Print stage banner:**
+```
++------------------------------------------+
+|  STAGE 2: Codebase Map                   |
+|  Status: Running...                      |
++------------------------------------------+
+```
+**Spawn 4 mapper agents in parallel (one per focus area):**
+```
+For each focus in [testability, risk, patterns, existing-tests]:
+Task(
+  prompt="
+    <objective>Analyze codebase for QA purposes. Focus area: {focus}. Write documents to {output_dir}/codebase/.</objective>
+    <execution_context>@agents/qaa-codebase-mapper.md</execution_context>
+    <files_to_read>
+    - CLAUDE.md
+    - {output_dir}/SCAN_MANIFEST.md
+    </files_to_read>
+    <parameters>
+    focus: {focus}
+    dev_repo_path: {dev_repo_path}
+    output_dir: {output_dir}/codebase
+    </parameters>
+  "
+)
+```
+All 4 agents can run simultaneously -- they read the codebase but write to separate files.
+**Parse mapper returns and verify outputs exist:**
+Expected files after all 4 complete:
+```
+{output_dir}/codebase/TESTABILITY.md
+{output_dir}/codebase/TEST_SURFACE.md
+{output_dir}/codebase/RISK_MAP.md
+{output_dir}/codebase/CRITICAL_PATHS.md
+{output_dir}/codebase/CODE_PATTERNS.md
+{output_dir}/codebase/API_CONTRACTS.md
+{output_dir}/codebase/TEST_ASSESSMENT.md
+{output_dir}/codebase/COVERAGE_GAPS.md
+```
+Verify at least 4 of 8 files exist (one per focus area produces 2 files). If any focus area produced 0 files, log warning but continue -- the downstream agents treat these as optional reads.
+**State update -- mark codebase-map as complete:**
+```bash
+node bin/qaa-tools.cjs state patch --"Map Status" complete --"Status" "Codebase mapped"
+```
+Print: "Codebase map complete. {N} documents produced in {output_dir}/codebase/."
+**Set codebase_map_dir for downstream stages:**
+```
+codebase_map_dir = "{output_dir}/codebase"
+```
+</step>
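The output check described in the added step (at least 4 of 8 files, with a warning when a focus area produced nothing) could be sketched as below. This is a minimal illustration, not code from the package: `check_codebase_map` is a hypothetical helper, and the grouping of file names under focus areas is an assumption inferred from the order of the expected-files list, which the diff does not state explicitly. The diff also does not say what should happen when fewer than 4 files exist, so the sketch only reports that fact.

```python
from pathlib import Path
from typing import Dict, List, Tuple

# ASSUMPTION: two documents per focus area, paired in the order they are listed.
EXPECTED_BY_FOCUS: Dict[str, Tuple[str, str]] = {
    "testability": ("TESTABILITY.md", "TEST_SURFACE.md"),
    "risk": ("RISK_MAP.md", "CRITICAL_PATHS.md"),
    "patterns": ("CODE_PATTERNS.md", "API_CONTRACTS.md"),
    "existing-tests": ("TEST_ASSESSMENT.md", "COVERAGE_GAPS.md"),
}

MIN_DOCUMENTS = 4  # "at least 4 of 8 files" from the step text


def check_codebase_map(codebase_dir: Path) -> Tuple[int, List[str], bool]:
    """Return (documents found, focus areas with no output, minimum met?)."""
    found = 0
    empty_focus_areas: List[str] = []
    for focus, names in EXPECTED_BY_FOCUS.items():
        present = [n for n in names if (codebase_dir / n).is_file()]
        found += len(present)
        if not present:
            # The step says to log a warning but continue in this case.
            empty_focus_areas.append(focus)
    return found, empty_focus_areas, found >= MIN_DOCUMENTS


if __name__ == "__main__":
    count, empty, ok = check_codebase_map(Path(".qa-output/codebase"))
    for focus in empty:
        print(f"warning: focus area '{focus}' produced no documents")
    print(f"Codebase map complete. {count} documents produced. minimum met: {ok}")
```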
 <step name="execute_analyze">
-## Step 4: Execute Analyze Stage
+## Step 5: Execute Analyze Stage
 **State update -- mark analyze as running:**
 ```bash
@@ -260,6 +334,10 @@ Task(
     <files_to_read>
     - {output_dir}/SCAN_MANIFEST.md
     - CLAUDE.md
+    - {codebase_map_dir}/RISK_MAP.md (if exists)
+    - {codebase_map_dir}/CRITICAL_PATHS.md (if exists)
+    - {codebase_map_dir}/TEST_ASSESSMENT.md (if exists)
+    - {codebase_map_dir}/COVERAGE_GAPS.md (if exists)
     </files_to_read>
     <parameters>
     mode: {mode}
@@ -267,6 +345,7 @@ Task(
     dev_repo_path: {dev_repo_path}
     qa_repo_path: {qa_repo_path or null}
     output_path: {output_dir}/
+    codebase_map_dir: {codebase_map_dir}
     </parameters>
   "
 )
@@ -398,10 +477,15 @@ Task(
     <files_to_read>
     - {input files based on option -- see above}
     - CLAUDE.md
+    - {codebase_map_dir}/TESTABILITY.md (if exists)
+    - {codebase_map_dir}/TEST_SURFACE.md (if exists)
+    - {codebase_map_dir}/CRITICAL_PATHS.md (if exists)
+    - {codebase_map_dir}/COVERAGE_GAPS.md (if exists)
     </files_to_read>
     <parameters>
     workflow_option: {option}
     output_path: {output_dir}/GENERATION_PLAN.md
+    codebase_map_dir: {codebase_map_dir}
     </parameters>
   "
 )
@@ -456,6 +540,9 @@ Task(
     - {output_dir}/GENERATION_PLAN.md
     - {output_dir}/TEST_INVENTORY.md (Option 1) or {output_dir}/GAP_ANALYSIS.md (Options 2/3)
     - CLAUDE.md
+    - {codebase_map_dir}/CODE_PATTERNS.md (if exists)
+    - {codebase_map_dir}/API_CONTRACTS.md (if exists)
+    - {codebase_map_dir}/TEST_SURFACE.md (if exists)
     </files_to_read>
     <parameters>
     workflow_option: {option}
@@ -463,6 +550,7 @@ Task(
     dev_repo_path: {dev_repo_path}
     qa_repo_path: {qa_repo_path or null}
     output_path: {output_dir}/
+    codebase_map_dir: {codebase_map_dir}
     </parameters>
   "
 )
@@ -482,12 +570,16 @@ Task(
     - {output_dir}/GENERATION_PLAN.md
     - {output_dir}/TEST_INVENTORY.md (Option 1) or {output_dir}/GAP_ANALYSIS.md (Options 2/3)
     - CLAUDE.md
+    - {codebase_map_dir}/CODE_PATTERNS.md (if exists)
+    - {codebase_map_dir}/API_CONTRACTS.md (if exists)
+    - {codebase_map_dir}/TEST_SURFACE.md (if exists)
     </files_to_read>
     <parameters>
     workflow_option: {option}
     dev_repo_path: {dev_repo_path}
     qa_repo_path: {qa_repo_path or null}
     output_path: {output_dir}/
+    codebase_map_dir: {codebase_map_dir}
     </parameters>
   "
 )
@@ -598,8 +690,77 @@ node bin/qaa-tools.cjs state patch --"Validate Status" complete --"Status" "Vali
 Print: "Validation complete. Status: {overall_status}. Confidence: {confidence}. {issues_found} issues found, {issues_fixed} fixed, {unresolved_count} unresolved."
 </step>
+<step name="execute_e2e_runner">
+## Step 9: Execute E2E Runner Stage (Conditional)
+**Condition:** Only execute if E2E test files were generated. Check if `files_created` from executor return contains any `*.e2e.spec.*` or `*.e2e.cy.*` files. Also requires a live application URL.
+If no E2E files were generated:
+Print: "Skipping E2E Runner (no E2E test files generated)." and proceed to Step 10.
+**State update:**
+```bash
+node bin/qaa-tools.cjs state patch --"Status" "Running E2E tests against live app"
+```
+**Print stage banner:**
+```
++------------------------------------------+
+|  STAGE 7: E2E Runner                     |
+|  Running tests against live application  |
+|  Status: Running...                      |
++------------------------------------------+
+```
+**Spawn e2e-runner agent via Task():**
+```
+Task(
+  prompt="
+    <objective>Run E2E tests against live application, capture real locators, fix mismatches, loop until pass</objective>
+    <execution_context>@agents/qaa-e2e-runner.md</execution_context>
+    <files_to_read>
+    - CLAUDE.md
+    - {e2e_test_files from executor return}
+    - {pom_files from executor return}
+    </files_to_read>
+    <parameters>
+    app_url: {app_url if provided, otherwise auto-detect}
+    output_dir: {output_dir}
+    dev_repo_path: {dev_repo_path}
+    </parameters>
+  "
+)
+```
+**Parse e2e-runner return:**
+Expected return structure:
+```
+E2E_RUNNER_COMPLETE:
+  app_url: "..."
+  total_tests: N
+  passed: N
+  failed: N
+  locator_fixes: N
+  app_bugs_found: N
+  fix_loops_used: N
+  report_path: "..."
+  screenshots: [...]
+```
+Capture `passed`, `failed`, `app_bugs_found`, `locator_fixes` for pipeline summary.
+If `app_bugs_found > 0`:
+Print: "E2E Runner found {app_bugs_found} application bug(s). See E2E_RUN_REPORT.md for details with screenshots."
+If `failed > 0` and `app_bugs_found == 0`:
+Print: "E2E Runner: {failed} test(s) still failing after {fix_loops_used} fix loops. Proceeding to Bug Detective for classification."
+Print: "E2E run complete. {passed}/{total_tests} passed. {locator_fixes} locators fixed. {app_bugs_found} app bugs found."
+</step>
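The gating condition and the summary messages of the added E2E Runner step can be illustrated with a short Python sketch. It assumes that `files_created` is a list of path strings and that the return structure has been parsed into a dict with the field names shown in the diff; `e2e_files` and `summary_lines` are hypothetical helper names. The sketch does not cover the second precondition, a live application URL.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Dict, List

# Glob patterns quoted in the step's condition.
E2E_PATTERNS = ("*.e2e.spec.*", "*.e2e.cy.*")


def e2e_files(files_created: List[str]) -> List[str]:
    """Keep only paths whose file name matches one of the E2E patterns."""
    return [
        f for f in files_created
        if any(fnmatchcase(PurePosixPath(f).name, p) for p in E2E_PATTERNS)
    ]


def summary_lines(result: Dict[str, int]) -> List[str]:
    """Build the messages the step prints, in the order the diff lists them."""
    lines: List[str] = []
    if result["app_bugs_found"] > 0:
        lines.append(
            f"E2E Runner found {result['app_bugs_found']} application bug(s). "
            "See E2E_RUN_REPORT.md for details with screenshots."
        )
    if result["failed"] > 0 and result["app_bugs_found"] == 0:
        lines.append(
            f"E2E Runner: {result['failed']} test(s) still failing after "
            f"{result['fix_loops_used']} fix loops. "
            "Proceeding to Bug Detective for classification."
        )
    lines.append(
        f"E2E run complete. {result['passed']}/{result['total_tests']} passed. "
        f"{result['locator_fixes']} locators fixed. "
        f"{result['app_bugs_found']} app bugs found."
    )
    return lines


if __name__ == "__main__":
    created = ["tests/e2e/login.e2e.spec.ts", "tests/unit/cart.unit.spec.ts"]
    if not e2e_files(created):
        print("Skipping E2E Runner (no E2E test files generated).")
    else:
        print("E2E files:", e2e_files(created))
```

Matching is done on the file name only and is case-sensitive (`fnmatchcase`); whether the real agent treats paths this way is not stated in the diff.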
 <step name="execute_bug_detective">
-## Step 9: Execute Bug Detective Stage (Conditional)
+## Step 10: Execute Bug Detective Stage (Conditional)
 **Condition:** Only execute if the validator reports test failures. Check:
 - `overall_status === 'FAIL'` in validator return, OR

package/agents/qaa-analyzer.md CHANGED

@@ -9,6 +9,15 @@ Read ALL of the following files BEFORE producing any output. The subagent MUST r
 - **templates/qa-analysis.md** -- QA_ANALYSIS output format contract. Defines the 6 required sections, field definitions per section, quality gate checklist, and a worked example. Your QA_ANALYSIS.md output must match this structure exactly.
 - **templates/test-inventory.md** -- TEST_INVENTORY output format contract. Defines the 5 required sections, per-test-case mandatory fields (all 7 for unit tests), quality gate checklist, and a worked example with 45 test cases. Your TEST_INVENTORY.md output must match this structure exactly.
 - **templates/qa-repo-blueprint.md** -- QA_REPO_BLUEPRINT format contract. Defines the 7 required sections for the repository blueprint. Produce this artifact only for Option 1 workflows.
+- **~/.claude/qaa/MY_PREFERENCES.md** (optional -- read if exists). User's personal QA preferences saved by the qa-learner skill. If a preference conflicts with CLAUDE.md, the preference wins (it is a user override). Check for rules about: language, framework choices, naming conventions, assertion style, output format, workflow preferences.
+- **Codebase map documents** (optional -- read if they exist in `{codebase_map_dir}/` or `.qa-output/codebase/`):
+  - **RISK_MAP.md** -- Business-critical paths, security-sensitive areas, data integrity risks. Use to prioritize P0 vs P1 vs P2 test targets and drive risk assessment.
+  - **CRITICAL_PATHS.md** -- Exact user flows for E2E smoke tests, happy path and error paths per critical operation. Use to define E2E test scope.
+  - **TEST_ASSESSMENT.md** -- Existing test quality, frameworks in use, patterns. Use to avoid recommending rebuilding what already works (especially for gap analysis mode).
+  - **COVERAGE_GAPS.md** -- Modules, functions, and paths with no test coverage. Use to target new tests precisely rather than duplicating existing ones.
+  If these files exist, they contain deep codebase knowledge that significantly improves analysis quality. Read them before producing output.
 - **CLAUDE.md** -- Read these specific sections:
   - **Testing Pyramid**: Target distribution (60-70% unit, 10-15% integration, 20-25% API, 3-5% E2E)
   - **Test Spec Rules**: Every test case mandatory fields (unique ID, exact target, concrete inputs, explicit expected outcome, priority)
@@ -71,6 +80,13 @@ Read all required input files before any analysis work.
    - Module Boundaries (analyzer reads and produces)
    - Verification Commands for QA_ANALYSIS.md and TEST_INVENTORY.md
    - Read-Before-Write Rules
+6. **Read codebase map documents** (if they exist -- check `{codebase_map_dir}/` or `.qa-output/codebase/`):
+   - **RISK_MAP.md** -- Extract risk areas with severity, evidence, and testing implications. Feed directly into Risk Assessment section of QA_ANALYSIS.md.
+   - **CRITICAL_PATHS.md** -- Extract user flows and error paths. Use to define E2E smoke test scope in TEST_INVENTORY.md.
+   - **TEST_ASSESSMENT.md** -- Extract existing test quality and framework patterns. Use in gap analysis mode to avoid recommending changes to working tests.
+   - **COVERAGE_GAPS.md** -- Extract uncovered modules and functions. Use to prioritize test targets in TEST_INVENTORY.md.
+   If any of these files do not exist, proceed without them -- they are optional but significantly improve analysis quality when available.
 </step>
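The "read if they exist" rule for the four analyzer-facing map documents amounts to an optional lookup across two candidate directories. A minimal Python sketch, assuming that `{codebase_map_dir}/` is checked before `.qa-output/codebase/` (the diff lists both locations but does not state a precedence) and using the hypothetical helper name `find_map_docs`:

```python
from pathlib import Path
from typing import Dict, Optional

# The four documents the analyzer is told to read when available.
ANALYZER_MAP_DOCS = (
    "RISK_MAP.md",
    "CRITICAL_PATHS.md",
    "TEST_ASSESSMENT.md",
    "COVERAGE_GAPS.md",
)


def find_map_docs(
    codebase_map_dir: Optional[str],
    fallback_dir: str = ".qa-output/codebase",
) -> Dict[str, Path]:
    """Map each document name to the first location where it exists.

    Missing documents are simply left out of the result, mirroring the
    "proceed without them" instruction.
    """
    # ASSUMPTION: the explicit directory takes precedence over the fallback.
    candidates = [Path(d) for d in (codebase_map_dir, fallback_dir) if d]
    found: Dict[str, Path] = {}
    for name in ANALYZER_MAP_DOCS:
        for directory in candidates:
            path = directory / name
            if path.is_file():
                found[name] = path
                break
    return found


if __name__ == "__main__":
    docs = find_map_docs(None)
    missing = [n for n in ANALYZER_MAP_DOCS if n not in docs]
    print("available:", sorted(docs))
    print("proceeding without:", missing)
```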
 <step name="assumptions_checkpoint">

package/agents/qaa-bug-detective.md CHANGED

@@ -17,6 +17,8 @@ Read ALL of the following files BEFORE classifying any failures. Do NOT skip.
 - **Test source files** (paths from orchestrator prompt or generation plan) -- The actual test files that will be executed and analyzed. Read these to understand test intent when classifying failures.
+- **~/.claude/qaa/MY_PREFERENCES.md** (optional -- read if exists). User's personal QA preferences saved by the qa-learner skill. If a preference conflicts with CLAUDE.md, the preference wins (it is a user override). Check for rules about: framework choices, assertion style, language preferences.
 Note: Read these files in full. Extract the decision tree, evidence field requirements, confidence level definitions, and auto-fix eligibility rules. These define your classification contract and output format.
 </required_reading>