qaa-agent 1.6.2 → 1.6.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (73)
  1. package/.claude/commands/create-test.md +164 -164
  2. package/.claude/commands/qa-audit.md +37 -37
  3. package/.claude/commands/qa-blueprint.md +54 -54
  4. package/.claude/commands/qa-fix.md +36 -36
  5. package/.claude/commands/qa-from-ticket.md +24 -24
  6. package/.claude/commands/qa-gap.md +20 -20
  7. package/.claude/commands/qa-map.md +47 -47
  8. package/.claude/commands/qa-pom.md +36 -36
  9. package/.claude/commands/qa-pr.md +23 -23
  10. package/.claude/commands/qa-pyramid.md +37 -37
  11. package/.claude/commands/qa-report.md +38 -38
  12. package/.claude/commands/qa-research.md +33 -33
  13. package/.claude/commands/qa-start.md +22 -22
  14. package/.claude/commands/qa-testid.md +19 -19
  15. package/.claude/commands/qa-validate.md +42 -42
  16. package/.claude/commands/update-test.md +58 -58
  17. package/.claude/settings.json +20 -20
  18. package/.claude/skills/qa-bug-detective/SKILL.md +122 -122
  19. package/.claude/skills/qa-learner/SKILL.md +150 -150
  20. package/.claude/skills/qa-repo-analyzer/SKILL.md +88 -88
  21. package/.claude/skills/qa-self-validator/SKILL.md +109 -109
  22. package/.claude/skills/qa-template-engine/SKILL.md +113 -113
  23. package/.claude/skills/qa-testid-injector/SKILL.md +93 -93
  24. package/.claude/skills/qa-workflow-documenter/SKILL.md +87 -87
  25. package/.mcp.json +8 -8
  26. package/CHANGELOG.md +71 -71
  27. package/CLAUDE.md +553 -553
  28. package/agents/qa-pipeline-orchestrator.md +1378 -1378
  29. package/agents/qaa-analyzer.md +524 -524
  30. package/agents/qaa-bug-detective.md +446 -446
  31. package/agents/qaa-codebase-mapper.md +935 -935
  32. package/agents/qaa-e2e-runner.md +415 -415
  33. package/agents/qaa-executor.md +651 -651
  34. package/agents/qaa-planner.md +390 -390
  35. package/agents/qaa-project-researcher.md +319 -319
  36. package/agents/qaa-scanner.md +424 -424
  37. package/agents/qaa-testid-injector.md +585 -585
  38. package/agents/qaa-validator.md +452 -452
  39. package/bin/install.cjs +198 -198
  40. package/bin/lib/commands.cjs +709 -709
  41. package/bin/lib/config.cjs +307 -307
  42. package/bin/lib/core.cjs +497 -497
  43. package/bin/lib/frontmatter.cjs +299 -299
  44. package/bin/lib/init.cjs +989 -989
  45. package/bin/lib/milestone.cjs +241 -241
  46. package/bin/lib/model-profiles.cjs +60 -60
  47. package/bin/lib/phase.cjs +911 -911
  48. package/bin/lib/roadmap.cjs +306 -306
  49. package/bin/lib/state.cjs +748 -748
  50. package/bin/lib/template.cjs +222 -222
  51. package/bin/lib/verify.cjs +842 -842
  52. package/bin/qaa-tools.cjs +607 -607
  53. package/docs/COMMANDS.md +341 -341
  54. package/docs/DEMO.md +182 -182
  55. package/docs/TESTING.md +156 -156
  56. package/package.json +41 -41
  57. package/templates/failure-classification.md +391 -391
  58. package/templates/gap-analysis.md +409 -409
  59. package/templates/pr-template.md +48 -48
  60. package/templates/qa-analysis.md +381 -381
  61. package/templates/qa-audit-report.md +465 -465
  62. package/templates/qa-repo-blueprint.md +636 -636
  63. package/templates/scan-manifest.md +312 -312
  64. package/templates/test-inventory.md +582 -582
  65. package/templates/testid-audit-report.md +354 -354
  66. package/templates/validation-report.md +243 -243
  67. package/workflows/qa-analyze.md +296 -296
  68. package/workflows/qa-from-ticket.md +536 -536
  69. package/workflows/qa-gap.md +303 -303
  70. package/workflows/qa-pr.md +389 -389
  71. package/workflows/qa-start.md +1168 -1168
  72. package/workflows/qa-testid.md +356 -356
  73. package/workflows/qa-validate.md +295 -295
package/docs/DEMO.md CHANGED
# QAA — QA Automation Agent

## What is it?

QAA is a multi-agent system that automates QA test creation for any software project. You point it at a codebase, and it analyzes the architecture, maps the code, generates a full test suite following industry standards, validates everything, and delivers the result as a draft pull request — ready for review.

No manual test writing. No guessing what to cover. One command, full pipeline.

## The Problem

Writing test suites is slow, repetitive, and often inconsistent. Teams face:

- **Starting from zero is painful** — a new project with no tests means weeks of setup before the first real test runs
- **Coverage gaps are invisible** — without analysis, teams don't know what's missing until something breaks in production
- **Standards drift** — different team members write tests differently: inconsistent locators, vague assertions, mixed naming conventions
- **QA is always behind dev** — features ship faster than tests get written, and the gap keeps growing
- **Existing QA teams still spend hours on repetitive work** — even with a mature test suite, adding tests for new features means manually inspecting pages, finding locators, writing POMs, running tests, fixing failures, repeat

## The Solution

QAA runs a pipeline of specialized AI agents, each responsible for one stage:

```
scan → map → analyze → plan → generate → validate → deliver
```

| Stage | What happens | Output |
|-------|-------------|--------|
| **Scan** | Detects framework, language, testable surfaces | SCAN_MANIFEST.md |
| **Map** | Deep-scans codebase for testability, risk, patterns, existing tests (4 parallel agents) | 8 codebase documents |
| **Analyze** | Produces risk assessment, test inventory, testing pyramid | QA_ANALYSIS.md, TEST_INVENTORY.md |
| **Plan** | Groups test cases by feature, assigns to files, resolves dependencies | GENERATION_PLAN.md |
| **Generate** | Writes test files, POMs, fixtures, configs following project standards | Test suite on disk |
| **Validate** | 4-layer validation (syntax, structure, dependencies, logic) with auto-fix | VALIDATION_REPORT.md |
| **Deliver** | Creates branch, commits per stage, pushes, opens draft PR | Pull request URL |

Every agent reads the project's QA standards (CLAUDE.md) before producing output. Every test case has a unique ID, concrete inputs, and explicit expected outcomes — never "works correctly."

## Three Workflows

QAA adapts to where the project is in its QA maturity:

**1. No QA repo yet** — `/qa-start --dev-repo ./myproject`
Full pipeline from scratch. Produces a complete test suite, QA repo blueprint, and a draft PR with everything.

**2. Immature QA repo** — `/qa-start --dev-repo ./myproject --qa-repo ./tests`
Scans both repos, identifies gaps, fixes broken tests, adds missing coverage, standardizes existing tests.

**3. Mature QA repo** — `/qa-start --dev-repo ./myproject --qa-repo ./tests`
Makes only surgical test additions where coverage is thin. Doesn't touch working tests.

## The "Brain" — Codebase Map

Before generating anything, QAA maps the entire codebase with 4 parallel agents:

- **Testability** — what's testable, pure functions vs stateful code, mock boundaries
- **Risk** — business-critical paths, security-sensitive areas, data integrity risks
- **Patterns** — naming conventions, API shapes, import style, code patterns
- **Existing tests** — current test quality, frameworks in use, coverage gaps

These 8 documents become the shared context that every downstream agent reads. The analyzer uses risk data to prioritize tests. The planner uses testability data to estimate complexity. The executor uses code patterns to generate tests that match the project's style.

Result: generated tests feel native to the codebase, not generic boilerplate.

## Day-to-Day for a QA Engineer

This is where QAA shines for teams that already have a mature QA repo. The full pipeline is for bootstrapping — but the real daily value is the targeted workflow.

### The scenario

You're a QA engineer. A developer just shipped a new "password reset" feature. You need tests. Here's what happens:

### Step 1: Map the codebase (once)

```
/qa-map
```

QAA scans the entire project and builds its "brain" — 8 documents covering testability, risk areas, API contracts, code patterns, and existing test coverage. This runs once and stays valid until the codebase changes significantly.

### Step 2: Create tests for the feature

```
/create-test "password reset"
```

QAA already knows the codebase. It reads the brain documents, finds the relevant source files (`auth.service.ts`, `reset.controller.ts`, the reset page component), understands the API contracts, and generates:

- Unit tests for the reset token logic with concrete inputs and expected outputs
- API tests for `POST /api/auth/reset-password` with real request/response shapes
- E2E tests with Page Object Models that use the project's existing POM base class
- Fixtures with test data (fake emails, expired tokens, invalid tokens)

All of it follows the project's naming conventions, import style, and assertion patterns.

### Step 3: Validate and fix in a loop

```
/qa-validate ./tests
```

The validator runs 4 layers of checks on every generated file:

1. **Syntax** — does it parse? Are imports correct?
2. **Structure** — does it follow POM rules? Are locators in the right tier?
3. **Dependencies** — do all imports resolve? Are mocks set up correctly?
4. **Logic** — are assertions concrete? Do test IDs follow the convention?

If issues are found, the validator auto-fixes them and re-checks — up to 3 loops. If something still fails, the bug detective classifies it: is it an application bug, a test code error, or an environment issue?
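The bounded validate → auto-fix → re-check cycle can be sketched roughly like this. Note that `validate` and `autoFix` are hypothetical stand-ins for illustration, not QAA's actual API:

```typescript
// Illustrative sketch of a bounded validate/auto-fix loop (not QAA's real implementation).
type Issue = { file: string; layer: "syntax" | "structure" | "deps" | "logic"; message: string };

// Hypothetical stand-ins for the real 4-layer validator and auto-fixer.
function validate(files: string[]): Issue[] {
  // A real implementation would parse each file and run all 4 layers.
  return [];
}

function autoFix(issue: Issue): void {
  // A real implementation would rewrite the offending test code.
}

const MAX_LOOPS = 3;

function validateWithAutoFix(files: string[]): Issue[] {
  for (let loop = 0; loop < MAX_LOOPS; loop++) {
    const issues = validate(files);
    if (issues.length === 0) return []; // clean: stop early
    issues.forEach(autoFix);            // attempt fixes, then re-check next loop
  }
  return validate(files); // anything still failing goes to the bug detective
}
```

The key design point is the hard cap: the loop never retries indefinitely, so unfixable issues surface for classification instead of spinning.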

### Step 4: Run the tests with Playwright

QAA integrates with Playwright to actually execute the generated E2E tests against a running application. It opens the browser, navigates pages, fills forms, clicks buttons, and captures what happens. If a test fails, it reads the error, inspects the page state, and determines whether the locator is wrong, the page changed, or there's a real bug.

The loop looks like this:

```
generate → validate → run → failures? → classify → fix test code → run again → pass
```

This continues until the tests pass or the issue is classified as an application bug that needs a developer fix.
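The triage step above can be pictured as a function from a failure message to one of the three categories. The heuristics below are invented for illustration and are not QAA's actual classifier:

```typescript
// Illustrative failure triage (hypothetical heuristics, not QAA's real classifier).
type FailureClass = "app-bug" | "test-code" | "environment";

function classifyFailure(errorMessage: string): FailureClass {
  const msg = errorMessage.toLowerCase();
  // Environment: the app never answered, so there was nothing to assert against.
  if (msg.includes("econnrefused") || msg.includes("net::err")) {
    return "environment";
  }
  // Test code: the locator no longer matches anything on the page.
  if (msg.includes("locator") && (msg.includes("not found") || msg.includes("strict mode violation"))) {
    return "test-code";
  }
  // Everything else: the page responded but the assertion failed — likely a real bug.
  return "app-bug";
}
```

Only the `test-code` bucket feeds the auto-fix loop; `app-bug` failures are reported to developers with evidence, and `environment` failures are retried once the app is reachable.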

### Step 5: Ship it

```
/qa-pr --ticket PROJ-456 "password reset tests"
```

QAA creates a branch following your team's naming convention (it asked you once and remembers forever), commits the test files, pushes, and opens a draft PR on GitHub, Azure DevOps, or GitLab — whatever your team uses. You get the link.

### The full daily flow

```
/qa-map → builds the "brain" (once)
/create-test "password reset" → generates tests using codebase knowledge
/qa-validate ./tests/unit/auth* → validates + auto-fixes
/qa-pr --ticket PROJ-456 "password reset tests" → draft PR with link
```

From ticket to PR in minutes, not hours. And the tests follow the same standards as every other test in the repo because QAA read the existing patterns first.

### What about tickets?

If you work from Jira, Linear, or GitHub Issues, skip the manual description:

```
/qa-from-ticket https://company.atlassian.net/browse/PROJ-456
```

QAA fetches the ticket, extracts acceptance criteria and edge cases, maps each criterion to test cases with a traceability matrix, generates the tests, validates them, and gives you a report showing which AC is covered by which test.

### When tests break after a deploy

```
/qa-fix ./tests/e2e/checkout*
```

QAA reads the failing tests, runs them, classifies each failure (app bug vs test code error vs environment issue), and auto-fixes the test code errors. Application bugs get flagged for the dev team with evidence — the exact assertion that failed, what was expected, and what was received.

## Standards

Every test artifact follows strict rules:

- **Testing pyramid** — 60-70% unit, 10-15% integration, 20-25% API, 3-5% E2E
- **Locator hierarchy** — data-testid first, ARIA roles, labels, CSS as last resort (with TODO)
- **Page Object Model** — one class per page, no assertions in POMs, locators as properties
- **Assertions** — concrete values only. `expect(status).toBe(200)` not `expect(status).toBeTruthy()`
- **Naming** — unique IDs per test case: `UT-AUTH-001`, `API-USERS-003`, `E2E-CHECKOUT-001`
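The locator hierarchy and the ID convention can each be expressed as a small helper. These are sketches with invented names to make the rules concrete, not part of QAA itself:

```typescript
// Sketch of the locator hierarchy and test-ID convention (illustrative helpers, not QAA's API).
interface ElementInfo {
  testId?: string; // data-testid attribute, preferred
  role?: string;   // ARIA role
  label?: string;  // accessible label
  css?: string;    // raw CSS selector, last resort
}

// Walk the hierarchy top-down and return the first available strategy.
function pickLocator(el: ElementInfo): string {
  if (el.testId) return `[data-testid="${el.testId}"]`;
  if (el.role) return `role=${el.role}`;
  if (el.label) return `label=${el.label}`;
  // CSS fallback always carries a TODO, per the rules above.
  return `${el.css ?? "*"} /* TODO: add data-testid */`;
}

// IDs follow <TIER>-<FEATURE>-<NNN>, e.g. UT-AUTH-001.
function nextTestId(tier: string, feature: string, existing: string[]): string {
  const prefix = `${tier}-${feature.toUpperCase()}-`;
  const max = existing
    .filter((id) => id.startsWith(prefix))
    .map((id) => parseInt(id.slice(prefix.length), 10))
    .reduce((a, b) => Math.max(a, b), 0);
  return `${prefix}${String(max + 1).padStart(3, "0")}`;
}
```

For example, `nextTestId("UT", "auth", ["UT-AUTH-001", "UT-AUTH-002"])` yields `UT-AUTH-003`, keeping IDs unique within a feature's tier.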

## Learning System

QAA remembers your preferences across sessions. When you correct it — "use Playwright, not Cypress" or "our branches start with feature/" — it saves the rule permanently. Next time, every agent reads your preferences before generating output.

Preferences override defaults. Your team's conventions always win.

## Numbers

17 commands. 7 skills. 11 agents. 10 templates. 7 workflows.

Supports GitHub, Azure DevOps, and GitLab. Works with Playwright, Cypress, Jest, Vitest, pytest, and more — detects what the project uses and matches it.

One goal: you focus on building features, QAA handles the tests.