qaa-agent 1.3.0 → 1.5.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.claude/commands/create-test.md +42 -4
- package/.claude/commands/qa-analyze.md +8 -10
- package/.claude/commands/qa-map.md +18 -7
- package/.claude/commands/qa-pr.md +23 -0
- package/.claude/commands/qa-validate.md +25 -3
- package/.claude/skills/qa-learner/SKILL.md +8 -0
- package/CLAUDE.md +23 -13
- package/README.md +20 -7
- package/agents/qa-pipeline-orchestrator.md +171 -10
- package/agents/qaa-analyzer.md +16 -0
- package/agents/qaa-bug-detective.md +2 -0
- package/agents/qaa-e2e-runner.md +415 -0
- package/agents/qaa-executor.md +14 -0
- package/agents/qaa-planner.md +17 -1
- package/agents/qaa-scanner.md +2 -0
- package/agents/qaa-testid-injector.md +2 -0
- package/agents/qaa-validator.md +2 -0
- package/bin/install.cjs +12 -4
- package/docs/COMMANDS.md +341 -0
- package/docs/DEMO.md +182 -0
- package/docs/TESTING.md +156 -0
- package/package.json +2 -1
- package/workflows/qa-pr.md +389 -0
package/docs/COMMANDS.md
ADDED
@@ -0,0 +1,341 @@

# QAA Commands Reference

## When to Use What

```
"I have a project with no tests" → /qa-start
"I want to understand the codebase first" → /qa-map
"I need tests for a specific feature" → /create-test
"I have a Jira/Linear ticket to cover" → /qa-from-ticket
"I want to check if my tests are good" → /qa-validate
"My tests are failing after a deploy" → /qa-fix
"I want to ship my tests as a PR" → /qa-pr
"I want a full report for my manager" → /qa-report
"I need to know what's missing in our test suite" → /qa-gap
"I want to add data-testid to components" → /qa-testid
"I need Page Object Models" → /qa-pom
"I want to audit test quality" → /qa-audit
"I need a QA repo structure from scratch" → /qa-blueprint
"I want to see our testing pyramid balance" → /qa-pyramid
"I want to analyze without generating anything" → /qa-analyze
"I want to research best testing practices" → /qa-research
"I want to improve existing tests" → /update-test
```

---

## Full Pipeline

### /qa-start

**What it does:** Runs the entire QA automation pipeline end-to-end. Scans the repo, maps the codebase, analyzes architecture, plans test cases, generates test files, validates them, runs E2E tests against the live app, and delivers everything as a draft PR.

**When to use:** Starting QA from scratch on a project, or doing a full gap-fill on an existing QA repo.

```
/qa-start --dev-repo /path/to/project
/qa-start --dev-repo /path/to/project --qa-repo /path/to/tests
/qa-start --dev-repo /path/to/project --auto
```

**Pipeline:**
```
scan → map → analyze → [testid-inject] → plan → generate → validate → [e2e-run] → [bug-detective] → deliver
```

**Produces:** SCAN_MANIFEST.md, 8 codebase map documents, QA_ANALYSIS.md, TEST_INVENTORY.md, QA_REPO_BLUEPRINT.md, test files, POMs, fixtures, VALIDATION_REPORT.md, E2E_RUN_REPORT.md, draft PR.

---

## Discovery & Analysis

### /qa-map

**What it does:** Deep-scans the codebase with 4 parallel agents (testability, risk, patterns, existing tests) and produces 8 documents that become the shared context for all other commands. This is the "brain" of the system.

**When to use:** Before using `/create-test` on a mature project. Run it once, and every subsequent command benefits from the codebase knowledge.

```
/qa-map
/qa-map --focus testability
/qa-map --focus risk
```

**Produces:** TESTABILITY.md, TEST_SURFACE.md, RISK_MAP.md, CRITICAL_PATHS.md, CODE_PATTERNS.md, API_CONTRACTS.md, TEST_ASSESSMENT.md, COVERAGE_GAPS.md — all in `.qa-output/codebase/`.

---

### /qa-analyze

**What it does:** Scans and analyzes a repository without generating any tests. Produces assessment documents only — architecture overview, risk areas, test inventory with every test case specified, and a testing pyramid recommendation.

**When to use:** You want to understand what needs testing before committing to generation. Good for presenting a plan to the team before executing.

```
/qa-analyze
/qa-analyze --dev-repo /path/to/project
/qa-analyze --dev-repo /path/to/project --qa-repo /path/to/tests
```

**Produces:** SCAN_MANIFEST.md, QA_ANALYSIS.md, TEST_INVENTORY.md, and either QA_REPO_BLUEPRINT.md (no QA repo) or GAP_ANALYSIS.md (QA repo provided).

---

### /qa-research

**What it does:** Researches the best testing stack, frameworks, and patterns for a specific project. Checks official docs and community best practices, and produces opinionated recommendations with confidence levels.

**When to use:** You're setting up testing for the first time and want to know which framework to use, or you're evaluating whether to switch from Jest to Vitest.

```
/qa-research
/qa-research --focus api-testing
/qa-research --focus e2e-strategy
```

**Produces:** TESTING_STACK.md, FRAMEWORK_CAPABILITIES.md, API_TESTING_STRATEGY.md, E2E_STRATEGY.md (depending on mode).

---

### /qa-gap

**What it does:** Compares a dev repo against its QA repo to find coverage gaps. Shows what's tested, what's missing, what's broken, and what's low quality.

**When to use:** You already have tests but suspect there are holes, and you want a concrete list of what to add next.

```
/qa-gap --dev-repo /path/to/project --qa-repo /path/to/tests
```

**Produces:** GAP_ANALYSIS.md with coverage map, missing tests, broken tests, and prioritized recommendations.

---

### /qa-pyramid

**What it does:** Analyzes the distribution of your test suite against the ideal testing pyramid (60-70% unit, 10-15% integration, 20-25% API, 3-5% E2E). Shows where you're heavy and where you're light.

**When to use:** You want to know if your test suite is balanced or if you're over-invested in E2E and under-invested in unit tests.

```
/qa-pyramid ./tests
/qa-pyramid ./tests --dev-repo /path/to/project
```

**Produces:** PYRAMID_ANALYSIS.md with current vs target distribution and an action plan.
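
The balance check itself is simple arithmetic over test counts. A minimal sketch in plain Python, using the target ranges quoted above (an illustration of the idea, not the command's implementation):

```python
from collections import Counter

# Target ranges from the pyramid above, in percent of total tests.
TARGETS = {"unit": (60, 70), "integration": (10, 15), "api": (20, 25), "e2e": (3, 5)}

def pyramid_report(test_types):
    """Compare a suite's actual distribution against the target ranges."""
    counts = Counter(test_types)
    total = sum(counts.values())
    report = {}
    for layer, (low, high) in TARGETS.items():
        pct = 100 * counts.get(layer, 0) / total
        verdict = "light" if pct < low else "heavy" if pct > high else "ok"
        report[layer] = (round(pct, 1), verdict)
    return report

suite = ["unit"] * 10 + ["api"] * 4 + ["e2e"] * 6   # an E2E-heavy suite
print(pyramid_report(suite))   # unit comes back "light", e2e "heavy"
```
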

---

## Test Creation

### /create-test

**What it does:** Generates tests for a specific feature or module. Reads the codebase map (if available) to understand the project's code patterns, API contracts, and existing conventions. If E2E tests are generated and an app is running, opens a real browser, captures actual locators from the page, fixes any mismatches, and loops until tests pass.

**When to use:** Daily work. A new feature shipped and you need tests for it.

```
/create-test login
/create-test "checkout flow"
/create-test "user API" --dev-repo /path/to/project
/create-test login --app-url http://localhost:3000
/create-test login --skip-run
```

**Produces:** Test spec files, POMs, fixtures. E2E_RUN_REPORT.md if tests were run against the app.

---

### /qa-from-ticket

**What it does:** Fetches a ticket from Jira, Linear, GitHub Issues, or a file. Extracts acceptance criteria and edge cases. Maps each criterion to test cases with a traceability matrix. Generates test files and validates them.

**When to use:** You work from tickets and want every acceptance criterion to have a corresponding test before the feature ships.

```
/qa-from-ticket https://company.atlassian.net/browse/PROJ-123
/qa-from-ticket #456
/qa-from-ticket ./tickets/feature-spec.md
/qa-from-ticket "As a user I want to reset my password via email"
```

**Produces:** TEST_CASES_FROM_TICKET.md (traceability matrix), GENERATION_PLAN_TICKET.md, test files, VALIDATION_REPORT.md.
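
A traceability matrix is essentially a mapping from acceptance criteria to test-case IDs, which makes uncovered criteria trivial to spot. A toy illustration (the criteria and IDs below are invented; the real output is TEST_CASES_FROM_TICKET.md):

```python
# Invented acceptance criteria and test-case IDs, for illustration only.
matrix = {
    "AC1: reset email is sent for a known address": ["API-AUTH-004", "E2E-AUTH-002"],
    "AC2: expired token is rejected with 401": ["UT-AUTH-007", "API-AUTH-005"],
    "AC3: user can request a new link after expiry": [],  # gap
}

uncovered = [ac for ac, tests in matrix.items() if not tests]
print(f"{len(matrix) - len(uncovered)}/{len(matrix)} criteria covered")  # 2/3 criteria covered
print("Missing:", uncovered)
```
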

---

### /qa-pom

**What it does:** Generates Page Object Model files for given pages. Follows CLAUDE.md POM rules: one class per page, no assertions, locators as properties, extends shared base.

**When to use:** You have pages that need POMs but don't want to generate full test suites yet.

```
/qa-pom ./src/pages
/qa-pom ./src/pages --framework playwright
```

**Produces:** POM files in `pages/` directory with BasePage and feature pages.

---

### /qa-testid

**What it does:** Scans frontend source code, audits interactive elements for missing `data-testid` attributes, and injects them following the naming convention (`{context}-{description}-{element-type}` in kebab-case). Creates a separate branch for the changes.

**When to use:** Before writing E2E tests. You want every button, input, and link to have a stable test ID.

```
/qa-testid ./src/components
/qa-testid ./src
```

**Produces:** TESTID_AUDIT_REPORT.md (coverage score, proposed values) and modified source files with `data-testid` attributes.
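
The naming rule is mechanical enough to sketch. A toy version of the convention in plain Python (not the injector's actual code):

```python
import re

def make_testid(context: str, description: str, element_type: str) -> str:
    """Build a {context}-{description}-{element-type} value in kebab-case."""
    def kebab(part: str) -> str:
        part = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", part)  # camelCase -> camel-Case
        return re.sub(r"[\s_]+", "-", part).lower()           # spaces/underscores -> dashes
    return "-".join(kebab(p) for p in (context, description, element_type))

print(make_testid("checkout", "applyCoupon", "button"))  # checkout-apply-coupon-button
```
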

---

## Validation & Fixing

### /qa-validate

**What it does:** Validates test files against CLAUDE.md standards. Runs 4 static layers (syntax, structure, dependencies, logic) with auto-fix up to 3 loops. Optionally runs E2E tests against a live app to verify locators and assertions with real page data.

**When to use:** After writing or generating tests, before shipping. Or as a quality check on an existing test suite.

```
/qa-validate ./tests
/qa-validate ./tests --classify
/qa-validate ./tests --run --app-url http://localhost:3000
```

**Produces:** VALIDATION_REPORT.md. FAILURE_CLASSIFICATION_REPORT.md if `--classify` is used. E2E_RUN_REPORT.md if `--run` is used.

---

### /qa-fix

**What it does:** Diagnoses and fixes broken test files. Reads failing tests, runs them, classifies each failure (app bug vs test code error vs environment issue), and auto-fixes test code errors.

**When to use:** Tests that were passing started failing after a deploy or code change.

```
/qa-fix ./tests/e2e/checkout*
/qa-fix ./tests --error "timeout waiting for selector"
```

**Produces:** Fixed test files. FAILURE_CLASSIFICATION_REPORT.md with evidence for each failure.

---

### /update-test

**What it does:** Improves existing tests without rewriting them. Upgrades locator tiers, makes assertions more specific, fixes naming conventions, adds missing test IDs.

**When to use:** Your tests work but don't follow current standards. You want incremental improvement, not a rewrite.

```
/update-test ./tests
/update-test ./tests --scope locators
/update-test ./tests --scope assertions
```

**Produces:** Updated test files with improvements applied.

---

## Reporting & Auditing

### /qa-audit

**What it does:** Full 6-dimension quality audit of a test suite. Scores: Locator Quality (20%), Assertion Specificity (20%), POM Compliance (15%), Test Coverage (20%), Naming Convention (15%), Test Data Management (10%). Gives a weighted overall score.

**When to use:** You want a quality score for your test suite with concrete recommendations for improvement.

```
/qa-audit ./tests
/qa-audit ./tests --dev-repo /path/to/project
```

**Produces:** QA_AUDIT_REPORT.md with scores, critical issues, and effort-estimated recommendations (S/M/L).

---

### /qa-report

**What it does:** Generates a QA status report adapted to the audience. Team level gets file-level details. Management gets high-level metrics. Client gets a coverage summary.

**When to use:** Standups, sprint reviews, client updates.

```
/qa-report ./tests
/qa-report ./tests --audience management
/qa-report ./tests --audience client
```

**Produces:** QA_STATUS_REPORT.md with metrics, pyramid distribution, risk areas, recommendations.

---

### /qa-blueprint

**What it does:** Generates a complete QA repository structure blueprint — recommended stack, folder layout, config files, CI/CD pipeline, npm scripts, and definition of done.

**When to use:** You're starting a QA repo from scratch and want a solid structure before writing the first test.

```
/qa-blueprint
/qa-blueprint --dev-repo /path/to/project
```

**Produces:** SCAN_MANIFEST.md, QA_REPO_BLUEPRINT.md.

---

## Delivery

### /qa-pr

**What it does:** Creates a draft PR from QA artifacts already on disk. Auto-detects the git platform (GitHub, Azure DevOps, GitLab). Applies your team's branch naming convention (asks once, remembers forever). Shows what will be committed and waits for your confirmation before pushing.

**When to use:** After generating or fixing tests, you want to package them as a PR.

```
/qa-pr --ticket PROJ-123 --title "login tests"
/qa-pr --scope e2e
/qa-pr --ticket PROJ-456 --title "checkout" --base develop
```

**Produces:** Git branch (following your convention), commits, draft PR with summary. Returns the PR URL.

---

## Common Flows

### New project, no tests
```
/qa-start --dev-repo ./myproject --auto
```

### Mature project, new feature
```
/qa-map                                          # once
/create-test "password reset"                    # generates + runs
/qa-pr --ticket PROJ-123 --title "reset tests"   # ships as PR
```

### From a Jira ticket
```
/qa-from-ticket https://jira.company.com/browse/PROJ-456
/qa-pr --ticket PROJ-456 --title "login flow"
```

### Fix broken tests after deploy
```
/qa-fix ./tests/e2e/checkout*
/qa-pr --ticket PROJ-789 --title "fix checkout tests"
```

### Quality check before release
```
/qa-audit ./tests --dev-repo ./myproject
/qa-report ./tests --audience management
```
package/docs/DEMO.md
ADDED
@@ -0,0 +1,182 @@

# QAA — QA Automation Agent

## What is it?

QAA is a multi-agent system that automates QA test creation for any software project. You point it at a codebase, and it analyzes the architecture, maps the code, generates a full test suite following industry standards, validates everything, and delivers the result as a draft pull request — ready for review.

No manual test writing. No guessing what to cover. One command, full pipeline.

## The Problem

Writing test suites is slow, repetitive, and often inconsistent. Teams face:

- **Starting from zero is painful** — a new project with no tests means weeks of setup before the first real test runs
- **Coverage gaps are invisible** — without analysis, teams don't know what's missing until something breaks in production
- **Standards drift** — different team members write tests differently: inconsistent locators, vague assertions, mixed naming conventions
- **QA is always behind dev** — features ship faster than tests get written, and the gap keeps growing
- **Existing QA teams still spend hours on repetitive work** — even with a mature test suite, adding tests for new features means manually inspecting pages, finding locators, writing POMs, running tests, fixing failures, repeat

## The Solution

QAA runs a pipeline of specialized AI agents, each responsible for one stage:

```
scan → map → analyze → plan → generate → validate → deliver
```

| Stage | What happens | Output |
|-------|-------------|--------|
| **Scan** | Detects framework, language, testable surfaces | SCAN_MANIFEST.md |
| **Map** | Deep-scans codebase for testability, risk, patterns, existing tests (4 parallel agents) | 8 codebase documents |
| **Analyze** | Produces risk assessment, test inventory, testing pyramid | QA_ANALYSIS.md, TEST_INVENTORY.md |
| **Plan** | Groups test cases by feature, assigns to files, resolves dependencies | GENERATION_PLAN.md |
| **Generate** | Writes test files, POMs, fixtures, configs following project standards | Test suite on disk |
| **Validate** | 4-layer validation (syntax, structure, dependencies, logic) with auto-fix | VALIDATION_REPORT.md |
| **Deliver** | Creates branch, commits per stage, pushes, opens draft PR | Pull request URL |

Every agent reads the project's QA standards (CLAUDE.md) before producing output. Every test case has a unique ID, concrete inputs, and explicit expected outcomes — never "works correctly."

## Three Workflows

QAA adapts to where the project is in its QA maturity:

**1. No QA repo yet** — `/qa-start --dev-repo ./myproject`
Full pipeline from scratch. Produces a complete test suite, QA repo blueprint, and a draft PR with everything.

**2. Immature QA repo** — `/qa-start --dev-repo ./myproject --qa-repo ./tests`
Scans both repos, identifies gaps, fixes broken tests, adds missing coverage, standardizes existing tests.

**3. Mature QA repo** — `/qa-start --dev-repo ./myproject --qa-repo ./tests`
Only adds surgical test additions where coverage is thin. Doesn't touch working tests.

## The "Brain" — Codebase Map

Before generating anything, QAA maps the entire codebase with 4 parallel agents:

- **Testability** — what's testable, pure functions vs stateful code, mock boundaries
- **Risk** — business-critical paths, security-sensitive areas, data integrity risks
- **Patterns** — naming conventions, API shapes, import style, code patterns
- **Existing tests** — current test quality, frameworks in use, coverage gaps

These 8 documents become the shared context that every downstream agent reads. The analyzer uses risk data to prioritize tests. The planner uses testability data to estimate complexity. The executor uses code patterns to generate tests that match the project's style.

Result: generated tests feel native to the codebase, not generic boilerplate.

## Day-to-Day for a QA Engineer

This is where QAA shines for teams that already have a mature QA repo. The full pipeline is for bootstrapping — but the real daily value is the targeted workflow.

### The scenario

You're a QA engineer. A developer just shipped a new "password reset" feature. You need tests. Here's what happens:

### Step 1: Map the codebase (once)

```
/qa-map
```

QAA scans the entire project and builds its "brain" — 8 documents covering testability, risk areas, API contracts, code patterns, and existing test coverage. This runs once and stays valid until the codebase changes significantly.

### Step 2: Create tests for the feature

```
/create-test "password reset"
```

QAA already knows the codebase. It reads the brain documents, finds the relevant source files (`auth.service.ts`, `reset.controller.ts`, the reset page component), understands the API contracts, and generates:

- Unit tests for the reset token logic with concrete inputs and expected outputs
- API tests for `POST /api/auth/reset-password` with real request/response shapes
- E2E tests with Page Object Models that use the project's existing POM base class
- Fixtures with test data (fake emails, expired tokens, invalid tokens)

All following the project's naming conventions, import style, and assertion patterns.

### Step 3: Validate and fix in a loop

```
/qa-validate ./tests
```

The validator runs 4 layers of checks on every generated file:

1. **Syntax** — does it parse? Are imports correct?
2. **Structure** — does it follow POM rules? Are locators in the right tier?
3. **Dependencies** — do all imports resolve? Are mocks set up correctly?
4. **Logic** — are assertions concrete? Do test IDs follow the convention?

If issues are found, the validator auto-fixes them and re-checks — up to 3 loops. If something still fails, the bug detective classifies it: is it an application bug, a test code error, or an environment issue?
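
The control flow of that loop can be sketched in a few lines. A minimal stand-in (the 3-loop cap is the documented behavior; the function names and toy stubs are illustrative):

```python
def validate_with_autofix(files, run_layers, auto_fix, max_loops=3):
    """Run the static layers, auto-fix any findings, and re-check, up to max_loops times."""
    for _ in range(max_loops):
        issues = run_layers(files)        # syntax, structure, dependencies, logic
        if not issues:
            return "pass"
        files = auto_fix(files, issues)   # rewrite the offending files
    return "classify"                     # still failing: hand off to the bug detective

# Toy stand-ins: one issue that the first auto-fix resolves.
state = {"broken": True}
run_layers = lambda files: ["vague-assertion"] if state["broken"] else []
auto_fix = lambda files, issues: (state.update(broken=False), files)[1]

print(validate_with_autofix(["login.spec.ts"], run_layers, auto_fix))  # pass
```
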

### Step 4: Run the tests with Playwright

QAA integrates with Playwright to actually execute the generated E2E tests against a running application. It opens the browser, navigates pages, fills forms, clicks buttons, and captures what happens. If a test fails, it reads the error, inspects the page state, and determines whether the locator is wrong, the page changed, or there's a real bug.

The loop looks like this:

```
generate → validate → run → failures? → classify → fix test code → run again → pass
```

This continues until the tests pass or the issue is classified as an application bug that needs a developer fix.

### Step 5: Ship it

```
/qa-pr --ticket PROJ-456 --title "password reset tests"
```

QAA creates a branch following your team's naming convention (it asked you once and remembers forever), commits the test files, pushes, and opens a draft PR on GitHub, Azure DevOps, or GitLab — whatever your team uses. You get the link.

### The full daily flow

```
/qa-map                                                   → builds the "brain" (once)
/create-test "password reset"                             → generates tests using codebase knowledge
/qa-validate ./tests/unit/auth*                           → validates + auto-fixes
/qa-pr --ticket PROJ-456 --title "password reset tests"   → draft PR with link
```

From ticket to PR in minutes, not hours. And the tests follow the same standards as every other test in the repo because QAA read the existing patterns first.

### What about tickets?

If you work from Jira, Linear, or GitHub Issues, skip the manual description:

```
/qa-from-ticket https://company.atlassian.net/browse/PROJ-456
```

QAA fetches the ticket, extracts acceptance criteria and edge cases, maps each criterion to test cases with a traceability matrix, generates the tests, validates them, and gives you a report showing which AC is covered by which test.

### When tests break after a deploy

```
/qa-fix ./tests/e2e/checkout*
```

QAA reads the failing tests, runs them, classifies each failure (app bug vs test code error vs environment issue), and auto-fixes the test code errors. Application bugs get flagged for the dev team with evidence — the exact assertion that failed, what was expected, and what was received.

## Standards

Every test artifact follows strict rules:

- **Testing pyramid** — 60-70% unit, 10-15% integration, 20-25% API, 3-5% E2E
- **Locator hierarchy** — data-testid first, ARIA roles, labels, CSS as last resort (with TODO)
- **Page Object Model** — one class per page, no assertions in POMs, locators as properties
- **Assertions** — concrete values only. `expect(status).toBe(200)` not `expect(status).toBeTruthy()`
- **Naming** — unique IDs per test case: `UT-AUTH-001`, `API-USERS-003`, `E2E-CHECKOUT-001`

## Learning System

QAA remembers your preferences across sessions. When you correct it — "use Playwright, not Cypress" or "our branches start with feature/" — it saves the rule permanently. Next time, every agent reads your preferences before generating output.

Preferences override defaults. Your team's conventions always win.

## Numbers

17 commands. 7 skills. 11 agents. 10 templates. 7 workflows.

Supports GitHub, Azure DevOps, and GitLab. Works with Playwright, Cypress, Jest, Vitest, pytest, and more — detects what the project uses and matches it.

One goal: you focus on building features, QAA handles the tests.
package/docs/TESTING.md
ADDED
@@ -0,0 +1,156 @@

# Testing Guide

How to validate the QA Automation Agent against real repositories.

## Test Repos

Each workflow option requires a different type of repository to test against.

### Option 1 — Dev Only (No QA Repo)

Test the full generation pipeline: scan → codebase-map → analyze → generate → validate → [e2e-run] → deliver.

| Repo | Stack | Why it's good |
|------|-------|---------------|
| [devdbrandy/restful-ecommerce](https://github.com/devdbrandy/restful-ecommerce) | Node.js, Express | Minimalist e-commerce API. Products, orders, auth. Almost no tests — agent generates everything from scratch. |
| [themodernmonk7/E-commerce-API](https://github.com/themodernmonk7/E-commerce-API) | Express, MongoDB | Full CRUD: auth, products, orders, reviews. No QA repo. Good variety of endpoints. |
| [dinushchathurya/nodejs-ecommerce-api](https://github.com/dinushchathurya/nodejs-ecommerce-api) | Express, MongoDB | Full e-commerce: users, categories, products, orders, images. No tests. |

**How to test:**
```bash
git clone https://github.com/devdbrandy/restful-ecommerce.git /tmp/test-option1
cd /tmp/test-option1
# Open Claude Code in this directory, then:
/qa-start --dev-repo .
```

**Expected output:**
- SCAN_MANIFEST.md with Node.js/Express detection
- QA_ANALYSIS.md with architecture overview and risk assessment
- TEST_INVENTORY.md with 30+ test cases (pyramid-driven)
- QA_REPO_BLUEPRINT.md (since no QA repo exists)
- Generated test files (unit, API, E2E)
- VALIDATION_REPORT.md
- Draft PR with all artifacts

### Option 2 — Dev + Immature QA Repo

Test the gap analysis and augmentation pipeline.

| DEV Repo | Stack | QA Repo |
|----------|-------|---------|
| [oozdal/to-do-list-api](https://github.com/oozdal/to-do-list-api) | FastAPI, SQLAlchemy | Create a crude QA repo manually (see below) |
| [KenMwaura1/Fast-Api-example](https://github.com/KenMwaura1/Fast-Api-example) | FastAPI, PostgreSQL, SQLAlchemy | Create a crude QA repo manually (see below) |

**Creating a crude QA repo for testing:**

No open-source repos have intentionally bad tests. Create one manually:

```bash
mkdir /tmp/test-option2-qa
cd /tmp/test-option2-qa
mkdir -p tests old_tests postman

# Create a broken test file with hardcoded tokens
cat > tests/test_api.py << 'EOF'
import requests

# TODO: ask Juan for new token
TOKEN = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.expired"
BASE_URL = "http://localhost:8000"

def test_get_todos():
    r = requests.get(f"{BASE_URL}/todos", headers={"Authorization": f"Bearer {TOKEN}"})
    assert r.status_code  # No specific assertion!

def test_create_todo():
    r = requests.post(f"{BASE_URL}/todos", json={"title": "test"})
    # No assertion at all
EOF

# Create a broken Selenium test
cat > old_tests/test_ui.py << 'EOF'
from selenium import webdriver
# This import doesn't work anymore
from selenium.webdriver.common.keys import Keys

def test_login():
    driver = webdriver.Chrome()  # Will fail - no ChromeDriver
    driver.get("http://localhost:3000/login")
    driver.find_element_by_class_name("login-btn").click()  # Deprecated API
EOF

# Create empty postman collection
echo '{"info": {"name": "Todo API"}, "item": []}' > postman/collection.json

# Bad README
printf '# QA Tests\nSome tests might need updating. Ask the team.\n' > README.md

git init && git add -A && git commit -m "initial qa setup"
```

**How to test:**
```bash
cd /tmp/test-option2-dev   # The FastAPI dev repo
/qa-start --dev-repo . --qa-repo /tmp/test-option2-qa
```

**Expected output:**
- Maturity score < 30 → Option 2 detected
- GAP_ANALYSIS.md showing broken tests, missing coverage
- Fixed test files (imports, assertions, configs)
- New test cases added for uncovered features
- VALIDATION_REPORT.md

### Option 3 — Dev + Mature QA Repo

Test the surgical additions pipeline.

| QA Repo | Stack | Why it's good |
|---------|-------|---------------|
| [OmonUrkinbaev/playwright-qa-automation](https://github.com/OmonUrkinbaev/playwright-qa-automation) | Playwright, TypeScript, POM | Production-style: UI + API tests, POM, data-driven, flaky handling, CI pipelines. |
| [idavidov13/Playwright-Framework](https://github.com/idavidov13/Playwright-Framework) | Playwright, TypeScript, POM | Custom fixtures, API mocking, Zod validation, GitHub Actions + GitLab CI. |
| [nareshnavinash/playwright-TS-pom](https://github.com/nareshnavinash/playwright-TS-pom) | Playwright, TypeScript, POM | GitLab CI, DataDog integration, ESLint, junit + HTML reporting. |

**How to test:**
```bash
# Clone the QA repo (it tests against a separate demo app)
git clone https://github.com/OmonUrkinbaev/playwright-qa-automation.git /tmp/test-option3-qa

# You need the DEV repo that the QA repo tests against
# Check the QA repo's README for the target application URL/repo

/qa-start --dev-repo /tmp/test-option3-dev --qa-repo /tmp/test-option3-qa
```

**Expected output:**
- Maturity score > 70 → Option 3 detected
- GAP_ANALYSIS.md showing thin coverage areas only
- Only missing tests added — existing tests untouched
- Existing POM conventions respected
- VALIDATION_REPORT.md
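
Option detection boils down to the maturity score. A sketch of the decision using only the thresholds stated in the expected outputs above (treating the whole 30-70 band as Option 2 is an assumption; the real logic lives in init.cjs):

```python
def pick_workflow(has_qa_repo, maturity_score=None):
    """Map repo state to a workflow option.
    The < 30 and > 70 thresholds come from the expected outputs above;
    treating the whole 30-70 band as Option 2 is an assumption."""
    if not has_qa_repo:
        return 1        # dev only: full generation pipeline
    if maturity_score > 70:
        return 3        # mature: surgical additions only
    return 2            # immature: gap analysis + augmentation

print(pick_workflow(False))        # 1
print(pick_workflow(True, 25))     # 2
print(pick_workflow(True, 85))     # 3
```
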

## Validation Checklist

After each test run, verify:

- [ ] Correct workflow option detected (1, 2, or 3)
- [ ] SCAN_MANIFEST.md has correct framework detection
- [ ] QA_ANALYSIS.md has real architecture info (not generic)
- [ ] TEST_INVENTORY.md has concrete inputs and expected outcomes
- [ ] Generated test files follow CLAUDE.md standards
- [ ] All locators use Tier 1 (data-testid) or Tier 2 (ARIA roles)
- [ ] No assertions use toBeTruthy/toBeDefined
- [ ] POM has no assertions (if E2E tests generated)
- [ ] VALIDATION_REPORT.md shows 4-layer check results
- [ ] Draft PR created with full summary and reviewer checklist
- [ ] No hardcoded credentials in any generated file

## Troubleshooting Test Runs

| Issue | Fix |
|-------|-----|
| `gh: command not found` | Install GitHub CLI: `brew install gh` or `winget install GitHub.cli` |
| `gh auth` error | Run `gh auth login` first |
| Agent detects wrong framework | Check SCAN_MANIFEST.md — the scanner's detection patterns may need adjusting |
| Maturity score seems wrong | Review the 5 scoring dimensions in `cmdInitQaStart` (init.cjs) |
| Tests fail to validate | Check if the test framework is installed in the target repo |