qaa-agent 1.3.0 → 1.5.0
- package/.claude/commands/create-test.md +42 -4
- package/.claude/commands/qa-analyze.md +8 -10
- package/.claude/commands/qa-map.md +18 -7
- package/.claude/commands/qa-pr.md +23 -0
- package/.claude/commands/qa-validate.md +25 -3
- package/.claude/skills/qa-learner/SKILL.md +8 -0
- package/CLAUDE.md +23 -13
- package/README.md +20 -7
- package/agents/qa-pipeline-orchestrator.md +171 -10
- package/agents/qaa-analyzer.md +16 -0
- package/agents/qaa-bug-detective.md +2 -0
- package/agents/qaa-e2e-runner.md +415 -0
- package/agents/qaa-executor.md +14 -0
- package/agents/qaa-planner.md +17 -1
- package/agents/qaa-scanner.md +2 -0
- package/agents/qaa-testid-injector.md +2 -0
- package/agents/qaa-validator.md +2 -0
- package/bin/install.cjs +12 -4
- package/docs/COMMANDS.md +341 -0
- package/docs/DEMO.md +182 -0
- package/docs/TESTING.md +156 -0
- package/package.json +2 -1
- package/workflows/qa-pr.md +389 -0
@@ -5,7 +5,7 @@ Invoked by the `/qa-start` slash command (Phase 6) or directly via Task() with t
 
 **Pipeline stages in order:**
 ```
-scan -> analyze -> [testid-inject if frontend] -> plan -> generate -> validate -> [bug-detective if failures] -> deliver
+scan -> codebase-map -> analyze -> [testid-inject if frontend] -> plan -> generate -> validate -> [e2e-runner if E2E tests] -> [bug-detective if failures] -> deliver
 ```
 
 **Workflow options:**
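The updated stage order in the hunk above, with its three bracketed conditional stages, can be sketched as a small function. This is only an illustration: `buildStageSequence` and the flag names `hasFrontend`, `hasE2ETests`, and `hasFailures` are hypothetical and do not appear in the package.

```javascript
// Hypothetical helper: rebuilds the stage order from the updated pipeline line.
// The flag names are illustrative assumptions, not identifiers from qaa-agent.
function buildStageSequence({ hasFrontend = false, hasE2ETests = false, hasFailures = false } = {}) {
  const stages = ['scan', 'codebase-map', 'analyze'];
  if (hasFrontend) stages.push('testid-inject'); // [testid-inject if frontend]
  stages.push('plan', 'generate', 'validate');
  if (hasE2ETests) stages.push('e2e-runner'); // [e2e-runner if E2E tests]
  if (hasFailures) stages.push('bug-detective'); // [bug-detective if failures]
  stages.push('deliver');
  return stages;
}

console.log(buildStageSequence({ hasFrontend: true, hasE2ETests: true }).join(' -> '));
// prints "scan -> codebase-map -> analyze -> testid-inject -> plan -> generate -> validate -> e2e-runner -> deliver"
```

With no flags set, the sketch yields the seven unconditional stages, with `codebase-map` now sitting between `scan` and `analyze`.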
@@ -109,31 +109,34 @@ Based on `option` value from init, select the stage sequence. Each option shares
 
 **Option 1 stages:**
 ```
-scan(dev) -> analyze(full) -> [testid-inject if frontend] -> plan -> generate -> validate -> [bug-detective if failures] -> deliver
+scan(dev) -> codebase-map -> analyze(full) -> [testid-inject if frontend] -> plan -> generate -> validate -> [bug-detective if failures] -> deliver
 ```
 - Scanner: scan DEV repo only
+- Codebase Map: deep-scan codebase for testability, risk, patterns, existing tests
 - Analyzer: mode='full' (produces QA_ANALYSIS.md + TEST_INVENTORY.md + QA_REPO_BLUEPRINT.md)
-- Planner: reads TEST_INVENTORY.md + QA_ANALYSIS.md
-- Executor: generates all planned test files
+- Planner: reads TEST_INVENTORY.md + QA_ANALYSIS.md + codebase map documents
+- Executor: generates all planned test files using codebase map for context
 - All stages run against DEV repo artifacts
 
 **Option 2 stages:**
 ```
-scan(both) -> analyze(gap) -> [testid-inject if frontend] -> plan(gap) -> generate(gap) -> validate -> [bug-detective if failures] -> deliver
+scan(both) -> codebase-map -> analyze(gap) -> [testid-inject if frontend] -> plan(gap) -> generate(gap) -> validate -> [bug-detective if failures] -> deliver
 ```
 - Scanner: scan BOTH dev_repo_path and qa_repo_path
+- Codebase Map: deep-scan dev codebase for testability, risk, patterns, existing tests
 - Analyzer: mode='gap' (produces GAP_ANALYSIS.md)
-- Planner: reads GAP_ANALYSIS.md (fix broken first, then add missing, then standardize)
+- Planner: reads GAP_ANALYSIS.md + codebase map documents (fix broken first, then add missing, then standardize)
 - Executor: generates fixed test files + new test files + standardized files
 - All stages aware of existing QA repo structure
 
 **Option 3 stages:**
 ```
-scan(both) -> analyze(gap) -> [testid-inject if frontend] -> plan(gap) -> generate(skip-existing) -> validate -> [bug-detective if failures] -> deliver
+scan(both) -> codebase-map -> analyze(gap) -> [testid-inject if frontend] -> plan(gap) -> generate(skip-existing) -> validate -> [bug-detective if failures] -> deliver
 ```
 - Scanner: scan BOTH dev_repo_path and qa_repo_path
+- Codebase Map: deep-scan dev codebase for testability, risk, patterns, existing tests
 - Analyzer: mode='gap' (produces GAP_ANALYSIS.md with thin areas only)
-- Planner: reads GAP_ANALYSIS.md (missing tests only)
+- Planner: reads GAP_ANALYSIS.md + codebase map documents (missing tests only)
 - Executor: passes `skip_existing_test_ids: true` so it checks existing test files by test ID before generating -- skips tests that already exist
 - Only new test files are generated; existing working tests are left untouched
 
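The three option descriptions in the hunk above differ only in the mode passed to each stage and in what the planner reads, so they can be summarized as a lookup table. The sketch below transcribes the new (`+`) side of the diff. The names `OPTION_STAGES` and `describeOption`, and the object shape, are assumptions made for illustration rather than the orchestrator's actual data model.

```javascript
// Illustrative lookup table transcribed from the "+" side of the three option blocks.
// Key names and object shape are hypothetical; 'default' means the stage takes no mode argument.
const OPTION_STAGES = {
  1: { scan: 'dev', analyze: 'full', plan: 'default', generate: 'default',
       plannerInputs: ['TEST_INVENTORY.md', 'QA_ANALYSIS.md', 'codebase map documents'] },
  2: { scan: 'both', analyze: 'gap', plan: 'gap', generate: 'gap',
       plannerInputs: ['GAP_ANALYSIS.md', 'codebase map documents'] },
  3: { scan: 'both', analyze: 'gap', plan: 'gap', generate: 'skip-existing',
       plannerInputs: ['GAP_ANALYSIS.md', 'codebase map documents'] },
};

// Renders the stage line for one option in the same notation the diff uses.
function describeOption(option) {
  const cfg = OPTION_STAGES[option];
  if (!cfg) throw new RangeError(`Unknown workflow option: ${option}`);
  const arg = (mode) => (mode === 'default' ? '' : `(${mode})`);
  return [
    `scan(${cfg.scan})`, 'codebase-map', `analyze(${cfg.analyze})`,
    '[testid-inject if frontend]', `plan${arg(cfg.plan)}`, `generate${arg(cfg.generate)}`,
    'validate', '[bug-detective if failures]', 'deliver',
  ].join(' -> ');
}
```

Calling `describeOption(3)` reproduces the Option 3 stage line from the diff, including `generate(skip-existing)`.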
@@ -231,8 +234,79 @@ If `detection_confidence` is `LOW`:
 - If `IS_AUTO` is false: Present the detection details to user. Wait for confirmation. On user response, spawn fresh continuation agent with user's framework choice.
 </step>
 
+<step name="execute_codebase_map">
+## Step 4: Execute Codebase Map Stage
+
+Deep-scan the codebase to produce QA-oriented documents that all downstream agents consume. This gives the pipeline full knowledge of the codebase structure, testability, risk areas, code patterns, and existing test coverage before any analysis or generation happens.
+
+**State update -- mark codebase-map as running:**
+```bash
+node bin/qaa-tools.cjs state patch --"Map Status" running --"Status" "Mapping codebase"
+```
+
+**Print stage banner:**
+```
++------------------------------------------+
+| STAGE 2: Codebase Map |
+| Status: Running... |
++------------------------------------------+
+```
+
+**Spawn 4 mapper agents in parallel (one per focus area):**
+
+```
+For each focus in [testability, risk, patterns, existing-tests]:
+
+Task(
+prompt="
+<objective>Analyze codebase for QA purposes. Focus area: {focus}. Write documents to {output_dir}/codebase/.</objective>
+<execution_context>@agents/qaa-codebase-mapper.md</execution_context>
+<files_to_read>
+- CLAUDE.md
+- {output_dir}/SCAN_MANIFEST.md
+</files_to_read>
+<parameters>
+focus: {focus}
+dev_repo_path: {dev_repo_path}
+output_dir: {output_dir}/codebase
+</parameters>
+"
+)
+```
+
+All 4 agents can run simultaneously -- they read the codebase but write to separate files.
+
+**Parse mapper returns and verify outputs exist:**
+
+Expected files after all 4 complete:
+```
+{output_dir}/codebase/TESTABILITY.md
+{output_dir}/codebase/TEST_SURFACE.md
+{output_dir}/codebase/RISK_MAP.md
+{output_dir}/codebase/CRITICAL_PATHS.md
+{output_dir}/codebase/CODE_PATTERNS.md
+{output_dir}/codebase/API_CONTRACTS.md
+{output_dir}/codebase/TEST_ASSESSMENT.md
+{output_dir}/codebase/COVERAGE_GAPS.md
+```
+
+Verify at least 4 of 8 files exist (one per focus area produces 2 files). If any focus area produced 0 files, log warning but continue -- the downstream agents treat these as optional reads.
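The verification rule just above (at least 4 of the 8 files present, a warning for any focus area with no output, and no hard failure) can be sketched as a pure function. The pairing of files to focus areas is an assumption inferred from the file names, since the diff lists the eight files without saying which mapper writes which pair. `EXPECTED_BY_FOCUS` and `checkMapOutputs` are hypothetical names.

```javascript
// Assumed pairing of focus areas to output files, inferred from the file names only.
const EXPECTED_BY_FOCUS = {
  testability: ['TESTABILITY.md', 'TEST_SURFACE.md'],
  risk: ['RISK_MAP.md', 'CRITICAL_PATHS.md'],
  patterns: ['CODE_PATTERNS.md', 'API_CONTRACTS.md'],
  'existing-tests': ['TEST_ASSESSMENT.md', 'COVERAGE_GAPS.md'],
};

// presentFiles: file names found in {output_dir}/codebase/.
// Returns a count, a pass flag, and warnings; it never throws, mirroring "log warning but continue".
function checkMapOutputs(presentFiles, minimum = 4) {
  const present = new Set(presentFiles);
  const warnings = [];
  let found = 0;
  for (const [focus, files] of Object.entries(EXPECTED_BY_FOCUS)) {
    const hits = files.filter((name) => present.has(name)).length;
    found += hits;
    if (hits === 0) warnings.push(`focus "${focus}" produced 0 files`);
  }
  return { found, ok: found >= minimum, warnings };
}
```

In Node, `presentFiles` could come from `fs.readdirSync` on the codebase map directory. The function itself stays free of filesystem access so it can be exercised in isolation.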
+
+**State update -- mark codebase-map as complete:**
+```bash
+node bin/qaa-tools.cjs state patch --"Map Status" complete --"Status" "Codebase mapped"
+```
+
+Print: "Codebase map complete. {N} documents produced in {output_dir}/codebase/."
+
+**Set codebase_map_dir for downstream stages:**
+```
+codebase_map_dir = "{output_dir}/codebase"
+```
+</step>
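Later hunks in this diff append `(if exists)` entries under `{codebase_map_dir}` to the `files_to_read` lists of the downstream Task() prompts. A minimal sketch of that optional-read pattern follows. The stage keys are inferred (the second group is taken to be the planner because its `output_path` is GENERATION_PLAN.md, and the third group the executor), and `MAP_DOCS_BY_STAGE` and `optionalReads` are hypothetical names.

```javascript
// Documents each downstream stage reads from the codebase map, per the later hunks.
// The stage keys are inferred from context and are not identifiers from the package.
const MAP_DOCS_BY_STAGE = {
  analyze: ['RISK_MAP.md', 'CRITICAL_PATHS.md', 'TEST_ASSESSMENT.md', 'COVERAGE_GAPS.md'],
  plan: ['TESTABILITY.md', 'TEST_SURFACE.md', 'CRITICAL_PATHS.md', 'COVERAGE_GAPS.md'],
  generate: ['CODE_PATTERNS.md', 'API_CONTRACTS.md', 'TEST_SURFACE.md'],
};

// Returns only the documents that exist, so a missing map file is skipped rather than fatal.
// `exists` is injected (for example fs.existsSync in Node) to keep the sketch side-effect free.
function optionalReads(stage, codebaseMapDir, exists) {
  const docs = MAP_DOCS_BY_STAGE[stage] ?? [];
  return docs
    .map((name) => `${codebaseMapDir}/${name}`)
    .filter((path) => exists(path));
}
```

A stage with no entry in the table gets an empty list, which matches the diff's description of these documents as optional reads.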
+
 <step name="execute_analyze">
-## Step
+## Step 5: Execute Analyze Stage
 
 **State update -- mark analyze as running:**
 ```bash
@@ -260,6 +334,10 @@ Task(
 <files_to_read>
 - {output_dir}/SCAN_MANIFEST.md
 - CLAUDE.md
+- {codebase_map_dir}/RISK_MAP.md (if exists)
+- {codebase_map_dir}/CRITICAL_PATHS.md (if exists)
+- {codebase_map_dir}/TEST_ASSESSMENT.md (if exists)
+- {codebase_map_dir}/COVERAGE_GAPS.md (if exists)
 </files_to_read>
 <parameters>
 mode: {mode}
@@ -267,6 +345,7 @@ Task(
 dev_repo_path: {dev_repo_path}
 qa_repo_path: {qa_repo_path or null}
 output_path: {output_dir}/
+codebase_map_dir: {codebase_map_dir}
 </parameters>
 "
 )
@@ -398,10 +477,15 @@ Task(
 <files_to_read>
 - {input files based on option -- see above}
 - CLAUDE.md
+- {codebase_map_dir}/TESTABILITY.md (if exists)
+- {codebase_map_dir}/TEST_SURFACE.md (if exists)
+- {codebase_map_dir}/CRITICAL_PATHS.md (if exists)
+- {codebase_map_dir}/COVERAGE_GAPS.md (if exists)
 </files_to_read>
 <parameters>
 workflow_option: {option}
 output_path: {output_dir}/GENERATION_PLAN.md
+codebase_map_dir: {codebase_map_dir}
 </parameters>
 "
 )
@@ -456,6 +540,9 @@ Task(
 - {output_dir}/GENERATION_PLAN.md
 - {output_dir}/TEST_INVENTORY.md (Option 1) or {output_dir}/GAP_ANALYSIS.md (Options 2/3)
 - CLAUDE.md
+- {codebase_map_dir}/CODE_PATTERNS.md (if exists)
+- {codebase_map_dir}/API_CONTRACTS.md (if exists)
+- {codebase_map_dir}/TEST_SURFACE.md (if exists)
 </files_to_read>
 <parameters>
 workflow_option: {option}
@@ -463,6 +550,7 @@ Task(
 dev_repo_path: {dev_repo_path}
 qa_repo_path: {qa_repo_path or null}
 output_path: {output_dir}/
+codebase_map_dir: {codebase_map_dir}
 </parameters>
 "
 )
@@ -482,12 +570,16 @@ Task(
 - {output_dir}/GENERATION_PLAN.md
 - {output_dir}/TEST_INVENTORY.md (Option 1) or {output_dir}/GAP_ANALYSIS.md (Options 2/3)
 - CLAUDE.md
+- {codebase_map_dir}/CODE_PATTERNS.md (if exists)
+- {codebase_map_dir}/API_CONTRACTS.md (if exists)
+- {codebase_map_dir}/TEST_SURFACE.md (if exists)
 </files_to_read>
 <parameters>
 workflow_option: {option}
 dev_repo_path: {dev_repo_path}
 qa_repo_path: {qa_repo_path or null}
 output_path: {output_dir}/
+codebase_map_dir: {codebase_map_dir}
 </parameters>
 "
 )
@@ -598,8 +690,77 @@ node bin/qaa-tools.cjs state patch --"Validate Status" complete --"Status" "Vali
 Print: "Validation complete. Status: {overall_status}. Confidence: {confidence}. {issues_found} issues found, {issues_fixed} fixed, {unresolved_count} unresolved."
 </step>
 
+<step name="execute_e2e_runner">
+## Step 9: Execute E2E Runner Stage (Conditional)
+
+**Condition:** Only execute if E2E test files were generated. Check if `files_created` from executor return contains any `*.e2e.spec.*` or `*.e2e.cy.*` files. Also requires a live application URL.
+
+If no E2E files were generated:
+Print: "Skipping E2E Runner (no E2E test files generated)." and proceed to Step 10.
+
+**State update:**
+```bash
+node bin/qaa-tools.cjs state patch --"Status" "Running E2E tests against live app"
+```
+
+**Print stage banner:**
+```
++------------------------------------------+
+| STAGE 7: E2E Runner |
+| Running tests against live application |
+| Status: Running... |
++------------------------------------------+
+```
+
+**Spawn e2e-runner agent via Task():**
+```
+Task(
+prompt="
+<objective>Run E2E tests against live application, capture real locators, fix mismatches, loop until pass</objective>
+<execution_context>@agents/qaa-e2e-runner.md</execution_context>
+<files_to_read>
+- CLAUDE.md
+- {e2e_test_files from executor return}
+- {pom_files from executor return}
+</files_to_read>
+<parameters>
+app_url: {app_url if provided, otherwise auto-detect}
+output_dir: {output_dir}
+dev_repo_path: {dev_repo_path}
+</parameters>
+"
+)
+```
+
+**Parse e2e-runner return:**
+
+Expected return structure:
+```
+E2E_RUNNER_COMPLETE:
+app_url: "..."
+total_tests: N
+passed: N
+failed: N
+locator_fixes: N
+app_bugs_found: N
+fix_loops_used: N
+report_path: "..."
+screenshots: [...]
+```
+
+Capture `passed`, `failed`, `app_bugs_found`, `locator_fixes` for pipeline summary.
+
+If `app_bugs_found > 0`:
+Print: "E2E Runner found {app_bugs_found} application bug(s). See E2E_RUN_REPORT.md for details with screenshots."
+
+If `failed > 0` and `app_bugs_found == 0`:
+Print: "E2E Runner: {failed} test(s) still failing after {fix_loops_used} fix loops. Proceeding to Bug Detective for classification."
+
+Print: "E2E run complete. {passed}/{total_tests} passed. {locator_fixes} locators fixed. {app_bugs_found} app bugs found."
+</step>
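The gating condition and the three summary messages in the step above can be sketched as two small functions. The regular expression is one possible reading of the `*.e2e.spec.*` and `*.e2e.cy.*` patterns. The field names follow the `E2E_RUNNER_COMPLETE` structure, while `hasE2EFiles` and `e2eSummaryLines` are hypothetical helper names. The live-URL requirement is left out because the Task() parameters also allow `app_url` to be auto-detected.

```javascript
// One reading of the glob patterns *.e2e.spec.* and *.e2e.cy.* as a regular expression.
const E2E_FILE = /\.e2e\.(spec|cy)\.[^/\\]+$/;

// True when the executor's files_created list contains at least one E2E test file.
function hasE2EFiles(filesCreated) {
  return filesCreated.some((file) => E2E_FILE.test(file));
}

// Builds the Print lines for a parsed E2E_RUNNER_COMPLETE return, in the order the step gives them.
function e2eSummaryLines(result) {
  const lines = [];
  if (result.app_bugs_found > 0) {
    lines.push(`E2E Runner found ${result.app_bugs_found} application bug(s). See E2E_RUN_REPORT.md for details with screenshots.`);
  }
  if (result.failed > 0 && result.app_bugs_found === 0) {
    lines.push(`E2E Runner: ${result.failed} test(s) still failing after ${result.fix_loops_used} fix loops. Proceeding to Bug Detective for classification.`);
  }
  lines.push(`E2E run complete. ${result.passed}/${result.total_tests} passed. ${result.locator_fixes} locators fixed. ${result.app_bugs_found} app bugs found.`);
  return lines;
}
```

The two conditional messages are mutually exclusive because they test opposite conditions on `app_bugs_found`, so the function returns at most two lines, and the final summary line is always present.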
+
 <step name="execute_bug_detective">
-## Step
+## Step 10: Execute Bug Detective Stage (Conditional)
 
 **Condition:** Only execute if the validator reports test failures. Check:
 - `overall_status === 'FAIL'` in validator return, OR
package/agents/qaa-analyzer.md CHANGED
@@ -9,6 +9,15 @@ Read ALL of the following files BEFORE producing any output. The subagent MUST r
 - **templates/qa-analysis.md** -- QA_ANALYSIS output format contract. Defines the 6 required sections, field definitions per section, quality gate checklist, and a worked example. Your QA_ANALYSIS.md output must match this structure exactly.
 - **templates/test-inventory.md** -- TEST_INVENTORY output format contract. Defines the 5 required sections, per-test-case mandatory fields (all 7 for unit tests), quality gate checklist, and a worked example with 45 test cases. Your TEST_INVENTORY.md output must match this structure exactly.
 - **templates/qa-repo-blueprint.md** -- QA_REPO_BLUEPRINT format contract. Defines the 7 required sections for the repository blueprint. Produce this artifact only for Option 1 workflows.
+- **~/.claude/qaa/MY_PREFERENCES.md** (optional -- read if exists). User's personal QA preferences saved by the qa-learner skill. If a preference conflicts with CLAUDE.md, the preference wins (it is a user override). Check for rules about: language, framework choices, naming conventions, assertion style, output format, workflow preferences.
+
+- **Codebase map documents** (optional -- read if they exist in `{codebase_map_dir}/` or `.qa-output/codebase/`):
+- **RISK_MAP.md** -- Business-critical paths, security-sensitive areas, data integrity risks. Use to prioritize P0 vs P1 vs P2 test targets and drive risk assessment.
+- **CRITICAL_PATHS.md** -- Exact user flows for E2E smoke tests, happy path and error paths per critical operation. Use to define E2E test scope.
+- **TEST_ASSESSMENT.md** -- Existing test quality, frameworks in use, patterns. Use to avoid recommending rebuilding what already works (especially for gap analysis mode).
+- **COVERAGE_GAPS.md** -- Modules, functions, and paths with no test coverage. Use to target new tests precisely rather than duplicating existing ones.
+If these files exist, they contain deep codebase knowledge that significantly improves analysis quality. Read them before producing output.
+
 - **CLAUDE.md** -- Read these specific sections:
 - **Testing Pyramid**: Target distribution (60-70% unit, 10-15% integration, 20-25% API, 3-5% E2E)
 - **Test Spec Rules**: Every test case mandatory fields (unique ID, exact target, concrete inputs, explicit expected outcome, priority)
@@ -71,6 +80,13 @@ Read all required input files before any analysis work.
 - Module Boundaries (analyzer reads and produces)
 - Verification Commands for QA_ANALYSIS.md and TEST_INVENTORY.md
 - Read-Before-Write Rules
+
+6. **Read codebase map documents** (if they exist -- check `{codebase_map_dir}/` or `.qa-output/codebase/`):
+- **RISK_MAP.md** -- Extract risk areas with severity, evidence, and testing implications. Feed directly into Risk Assessment section of QA_ANALYSIS.md.
+- **CRITICAL_PATHS.md** -- Extract user flows and error paths. Use to define E2E smoke test scope in TEST_INVENTORY.md.
+- **TEST_ASSESSMENT.md** -- Extract existing test quality and framework patterns. Use in gap analysis mode to avoid recommending changes to working tests.
+- **COVERAGE_GAPS.md** -- Extract uncovered modules and functions. Use to prioritize test targets in TEST_INVENTORY.md.
+If any of these files do not exist, proceed without them -- they are optional but significantly improve analysis quality when available.
 </step>
 
 <step name="assumptions_checkpoint">
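The analyzer hunks above and the hunk that follows both add MY_PREFERENCES.md with the rule that a user preference wins over CLAUDE.md on conflict. A minimal sketch of that precedence follows, assuming both files have already been parsed into flat key/value rule objects. The parsing step, the key names, and `resolveRules` are all hypothetical, since the real inputs are Markdown files read by an agent.

```javascript
// Hypothetical rule objects stand in for the parsed contents of CLAUDE.md and MY_PREFERENCES.md.
// Later spread wins, so user preferences override project defaults key by key.
function resolveRules(claudeRules, userPreferences = {}) {
  const resolved = { ...claudeRules, ...userPreferences };
  // Report which defaults were actually overridden, so the override is visible rather than silent.
  const overridden = Object.keys(userPreferences).filter(
    (key) => key in claudeRules && claudeRules[key] !== userPreferences[key],
  );
  return { resolved, overridden };
}
```

When MY_PREFERENCES.md is absent, calling `resolveRules(claudeRules)` returns the CLAUDE.md rules unchanged with an empty `overridden` list, which matches the "optional -- read if exists" wording.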
@@ -17,6 +17,8 @@ Read ALL of the following files BEFORE classifying any failures. Do NOT skip.
 
 - **Test source files** (paths from orchestrator prompt or generation plan) -- The actual test files that will be executed and analyzed. Read these to understand test intent when classifying failures.
 
+- **~/.claude/qaa/MY_PREFERENCES.md** (optional -- read if exists). User's personal QA preferences saved by the qa-learner skill. If a preference conflicts with CLAUDE.md, the preference wins (it is a user override). Check for rules about: framework choices, assertion style, language preferences.
+
 Note: Read these files in full. Extract the decision tree, evidence field requirements, confidence level definitions, and auto-fix eligibility rules. These define your classification contract and output format.
 </required_reading>
 