qaa-agent 1.3.0 → 1.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
# QAA Commands Reference

## When to Use What

```
"I have a project with no tests" → /qa-start
"I want to understand the codebase first" → /qa-map
"I need tests for a specific feature" → /create-test
"I have a Jira/Linear ticket to cover" → /qa-from-ticket
"I want to check if my tests are good" → /qa-validate
"My tests are failing after a deploy" → /qa-fix
"I want to ship my tests as a PR" → /qa-pr
"I want a full report for my manager" → /qa-report
"I need to know what's missing in our test suite" → /qa-gap
"I want to add data-testid to components" → /qa-testid
"I need Page Object Models" → /qa-pom
"I want to audit test quality" → /qa-audit
"I need a QA repo structure from scratch" → /qa-blueprint
"I want to see our testing pyramid balance" → /qa-pyramid
"I want to analyze without generating anything" → /qa-analyze
"I want to research best testing practices" → /qa-research
"I want to improve existing tests" → /update-test
```

---

## Full Pipeline

### /qa-start

**What it does:** Runs the entire QA automation pipeline end-to-end. Scans the repo, maps the codebase, analyzes architecture, plans test cases, generates test files, validates them, runs E2E tests against the live app, and delivers everything as a draft PR.

**When to use:** Starting QA from scratch on a project, or doing a full gap-fill on an existing QA repo.

```
/qa-start --dev-repo /path/to/project
/qa-start --dev-repo /path/to/project --qa-repo /path/to/tests
/qa-start --dev-repo /path/to/project --auto
```

**Pipeline:**
```
scan → map → analyze → [testid-inject] → plan → generate → validate → [e2e-run] → [bug-detective] → deliver
```

**Produces:** SCAN_MANIFEST.md, 8 codebase map documents, QA_ANALYSIS.md, TEST_INVENTORY.md, QA_REPO_BLUEPRINT.md, test files, POMs, fixtures, VALIDATION_REPORT.md, E2E_RUN_REPORT.md, draft PR.

---

## Discovery & Analysis

### /qa-map

**What it does:** Deep-scans the codebase with 4 parallel agents (testability, risk, patterns, existing tests) and produces 8 documents that become the shared context for all other commands. This is the "brain" of the system.

**When to use:** Before using `/create-test` on a mature project. Run it once, and every subsequent command benefits from the codebase knowledge.

```
/qa-map
/qa-map --focus testability
/qa-map --focus risk
```

**Produces:** TESTABILITY.md, TEST_SURFACE.md, RISK_MAP.md, CRITICAL_PATHS.md, CODE_PATTERNS.md, API_CONTRACTS.md, TEST_ASSESSMENT.md, COVERAGE_GAPS.md — all in `.qa-output/codebase/`.

---

### /qa-analyze

**What it does:** Scans and analyzes a repository without generating any tests. Produces assessment documents only — architecture overview, risk areas, test inventory with every test case specified, and a testing pyramid recommendation.

**When to use:** You want to understand what needs testing before committing to generation. Good for presenting a plan to the team before executing.

```
/qa-analyze
/qa-analyze --dev-repo /path/to/project
/qa-analyze --dev-repo /path/to/project --qa-repo /path/to/tests
```

**Produces:** SCAN_MANIFEST.md, QA_ANALYSIS.md, TEST_INVENTORY.md, and either QA_REPO_BLUEPRINT.md (no QA repo) or GAP_ANALYSIS.md (QA repo provided).

---

### /qa-research

**What it does:** Researches the best testing stack, frameworks, and patterns for a specific project. Checks official docs, community best practices, and produces opinionated recommendations with confidence levels.

**When to use:** You're setting up testing for the first time and want to know which framework to use, or you're evaluating whether to switch from Jest to Vitest.

```
/qa-research
/qa-research --focus api-testing
/qa-research --focus e2e-strategy
```

**Produces:** TESTING_STACK.md, FRAMEWORK_CAPABILITIES.md, API_TESTING_STRATEGY.md, E2E_STRATEGY.md (depending on mode).

---

### /qa-gap

**What it does:** Compares a dev repo against its QA repo to find coverage gaps. Shows what's tested, what's missing, what's broken, and what's low quality.

**When to use:** You already have tests but suspect there are holes. Want a concrete list of what to add next.

```
/qa-gap --dev-repo /path/to/project --qa-repo /path/to/tests
```

**Produces:** GAP_ANALYSIS.md with coverage map, missing tests, broken tests, and prioritized recommendations.

---

### /qa-pyramid

**What it does:** Analyzes the distribution of your test suite against the ideal testing pyramid (60-70% unit, 10-15% integration, 20-25% API, 3-5% E2E). Shows where you're heavy and where you're light.

**When to use:** You want to know if your test suite is balanced or if you're over-invested in E2E and under-invested in unit tests.

```
/qa-pyramid ./tests
/qa-pyramid ./tests --dev-repo /path/to/project
```

**Produces:** PYRAMID_ANALYSIS.md with current vs target distribution and an action plan.
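The pyramid check itself is simple arithmetic. A minimal sketch of the idea (illustrative only, not QAA's implementation), using the target bands quoted above:

```python
# Illustrative sketch: compare a suite's layer distribution against the
# target pyramid bands stated above. Not QAA's actual scoring code.
TARGET_BANDS = {
    "unit": (60, 70),
    "integration": (10, 15),
    "api": (20, 25),
    "e2e": (3, 5),
}

def pyramid_report(counts: dict) -> dict:
    """Return each layer's share (%) and whether it is light, balanced, or heavy."""
    total = sum(counts.values())
    report = {}
    for layer, (lo, hi) in TARGET_BANDS.items():
        pct = 100 * counts.get(layer, 0) / total
        status = "light" if pct < lo else "heavy" if pct > hi else "balanced"
        report[layer] = (round(pct, 1), status)
    return report

# A suite that is E2E-heavy and unit-light:
print(pyramid_report({"unit": 40, "integration": 10, "api": 15, "e2e": 35}))
```

The "action plan" in PYRAMID_ANALYSIS.md is essentially the list of layers flagged `light` or `heavy` by a check like this.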

---

## Test Creation

### /create-test

**What it does:** Generates tests for a specific feature or module. Reads the codebase map (if available) to understand the project's code patterns, API contracts, and existing conventions. If E2E tests are generated and an app is running, opens a real browser, captures actual locators from the page, fixes any mismatches, and loops until tests pass.

**When to use:** Daily work. A new feature shipped and you need tests for it.

```
/create-test login
/create-test "checkout flow"
/create-test "user API" --dev-repo /path/to/project
/create-test login --app-url http://localhost:3000
/create-test login --skip-run
```

**Produces:** Test spec files, POMs, fixtures. E2E_RUN_REPORT.md if tests were run against the app.

---

### /qa-from-ticket

**What it does:** Fetches a ticket from Jira, Linear, GitHub Issues, or a file. Extracts acceptance criteria and edge cases. Maps each criterion to test cases with a traceability matrix. Generates test files and validates them.

**When to use:** You work from tickets and want every acceptance criterion to have a corresponding test before the feature ships.

```
/qa-from-ticket https://company.atlassian.net/browse/PROJ-123
/qa-from-ticket #456
/qa-from-ticket ./tickets/feature-spec.md
/qa-from-ticket "As a user I want to reset my password via email"
```

**Produces:** TEST_CASES_FROM_TICKET.md (traceability matrix), GENERATION_PLAN_TICKET.md, test files, VALIDATION_REPORT.md.
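A traceability matrix is just a mapping from acceptance criteria to the test IDs that cover them, plus a flag for anything uncovered. A hedged sketch of the idea (the real report format may differ):

```python
# Illustrative traceability matrix: map each acceptance criterion (AC)
# to the test-case IDs covering it. IDs and AC labels here are made up.
def traceability(acs: list, tests: dict) -> dict:
    """tests maps test-case ID -> list of AC ids that test covers."""
    matrix = {ac: [] for ac in acs}
    for test_id, covered in tests.items():
        for ac in covered:
            if ac in matrix:
                matrix[ac].append(test_id)
    return matrix

matrix = traceability(
    ["AC-1", "AC-2", "AC-3"],
    {"API-AUTH-001": ["AC-1"], "E2E-AUTH-002": ["AC-1", "AC-2"]},
)
uncovered = [ac for ac, ids in matrix.items() if not ids]
print(uncovered)  # AC-3 has no covering test yet
```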

---

### /qa-pom

**What it does:** Generates Page Object Model files for given pages. Follows CLAUDE.md POM rules: one class per page, no assertions, locators as properties, extends shared base.

**When to use:** You have pages that need POMs but don't want to generate full test suites yet.

```
/qa-pom ./src/pages
/qa-pom ./src/pages --framework playwright
```

**Produces:** POM files in `pages/` directory with BasePage and feature pages.

---

### /qa-testid

**What it does:** Scans frontend source code, audits interactive elements for missing `data-testid` attributes, and injects them following the naming convention (`{context}-{description}-{element-type}` in kebab-case). Creates a separate branch for the changes.

**When to use:** Before writing E2E tests. You want every button, input, and link to have a stable test ID.

```
/qa-testid ./src/components
/qa-testid ./src
```

**Produces:** TESTID_AUDIT_REPORT.md (coverage score, proposed values) and modified source files with `data-testid` attributes.
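The stated convention — `{context}-{description}-{element-type}` in kebab-case — is regular enough to lint yourself. A small sketch (my own check, not QAA's code; it assumes the description may span several kebab segments):

```python
import re

# Kebab-case with at least three segments: context, description, element type.
# Assumption: "description" may itself be more than one segment.
TESTID_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+){2,}$")

def valid_testid(value: str) -> bool:
    return bool(TESTID_RE.fullmatch(value))

assert valid_testid("login-submit-button")
assert valid_testid("checkout-promo-code-input")
assert not valid_testid("LoginSubmitButton")  # not kebab-case
assert not valid_testid("login-button")       # missing a segment
```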

---

## Validation & Fixing

### /qa-validate

**What it does:** Validates test files against CLAUDE.md standards. Runs 4 static layers (syntax, structure, dependencies, logic) with auto-fix up to 3 loops. Optionally runs E2E tests against a live app to verify locators and assertions with real page data.

**When to use:** After writing or generating tests, before shipping. Or as a quality check on an existing test suite.

```
/qa-validate ./tests
/qa-validate ./tests --classify
/qa-validate ./tests --run --app-url http://localhost:3000
```

**Produces:** VALIDATION_REPORT.md. FAILURE_CLASSIFICATION_REPORT.md if `--classify` used. E2E_RUN_REPORT.md if `--run` used.
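The validate-and-auto-fix cycle is a bounded loop. An illustrative sketch of the control flow only — `validate` and `auto_fix` here are hypothetical hooks, not QAA internals:

```python
# Illustrative bounded validate -> auto-fix loop (max 3 passes), mirroring
# the behavior described above. The callables are stand-in hooks.
def validate_with_autofix(files, validate, auto_fix, max_loops=3):
    for attempt in range(1, max_loops + 1):
        issues = validate(files)          # syntax, structure, deps, logic
        if not issues:
            return {"status": "pass", "attempts": attempt}
        files = auto_fix(files, issues)   # rewrite offending files, re-check
    return {"status": "needs-classification", "attempts": max_loops}

# Toy hooks: each fix pass resolves one outstanding issue.
issues_left = {"n": 2}
result = validate_with_autofix(
    files=["a.spec.ts"],
    validate=lambda f: ["issue"] * issues_left["n"],
    auto_fix=lambda f, i: (issues_left.update(n=issues_left["n"] - 1), f)[1],
)
print(result)  # passes on the third validation pass
```

Anything still failing after the last loop is what gets handed to failure classification.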

---

### /qa-fix

**What it does:** Diagnoses and fixes broken test files. Reads failing tests, runs them, classifies each failure (app bug vs test code error vs environment issue), and auto-fixes test code errors.

**When to use:** Tests that were passing started failing after a deploy or code change.

```
/qa-fix ./tests/e2e/checkout*
/qa-fix ./tests --error "timeout waiting for selector"
```

**Produces:** Fixed test files. FAILURE_CLASSIFICATION_REPORT.md with evidence for each failure.
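The three-way triage can be sketched as a rule over the failure evidence. The heuristics below are heavily simplified and mine, not QAA's:

```python
# Simplified three-way failure triage: environment issue vs test code
# error vs application bug. Keyword rules are illustrative only.
def classify_failure(error: str) -> str:
    e = error.lower()
    if any(k in e for k in ("econnrefused", "dns", "browser not installed")):
        return "environment"
    if any(k in e for k in ("timeout waiting for selector", "locator", "is not a function")):
        return "test-code"
    return "app-bug"  # e.g. a concrete assertion failed on a real response

assert classify_failure("timeout waiting for selector [data-testid=pay-button]") == "test-code"
assert classify_failure("connect ECONNREFUSED 127.0.0.1:3000") == "environment"
assert classify_failure("expected status 200, received 500") == "app-bug"
```

In the real report, each classification is backed by evidence (the failing assertion, expected vs received), not just an error string.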

---

### /update-test

**What it does:** Improves existing tests without rewriting them. Upgrades locator tiers, makes assertions more specific, fixes naming conventions, adds missing test IDs.

**When to use:** Your tests work but don't follow current standards. You want incremental improvement, not a rewrite.

```
/update-test ./tests
/update-test ./tests --scope locators
/update-test ./tests --scope assertions
```

**Produces:** Updated test files with improvements applied.

---

## Reporting & Auditing

### /qa-audit

**What it does:** Full 6-dimension quality audit of a test suite. Scores: Locator Quality (20%), Assertion Specificity (20%), POM Compliance (15%), Test Coverage (20%), Naming Convention (15%), Test Data Management (10%). Gives a weighted overall score.

**When to use:** You want a quality score for your test suite with concrete recommendations for improvement.

```
/qa-audit ./tests
/qa-audit ./tests --dev-repo /path/to/project
```

**Produces:** QA_AUDIT_REPORT.md with scores, critical issues, and effort-estimated recommendations (S/M/L).
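The weighting is a straightforward weighted sum. A sketch using the dimension weights listed above (per-dimension scores on a 0-100 scale are an assumption; the weights are the stated ones):

```python
# Weighted overall audit score from the six stated dimensions.
# Assumption: each dimension is scored 0-100 before weighting.
WEIGHTS = {
    "locator_quality": 0.20,
    "assertion_specificity": 0.20,
    "pom_compliance": 0.15,
    "test_coverage": 0.20,
    "naming_convention": 0.15,
    "test_data_management": 0.10,
}

def overall_score(scores: dict) -> float:
    return round(sum(scores[d] * w for d, w in WEIGHTS.items()), 1)

print(overall_score({
    "locator_quality": 90, "assertion_specificity": 70,
    "pom_compliance": 80, "test_coverage": 60,
    "naming_convention": 100, "test_data_management": 50,
}))
```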

---

### /qa-report

**What it does:** Generates a QA status report adapted to the audience. Team level gets file-level details. Management gets high-level metrics. Client gets a coverage summary.

**When to use:** Standups, sprint reviews, client updates.

```
/qa-report ./tests
/qa-report ./tests --audience management
/qa-report ./tests --audience client
```

**Produces:** QA_STATUS_REPORT.md with metrics, pyramid distribution, risk areas, recommendations.

---

### /qa-blueprint

**What it does:** Generates a complete QA repository structure blueprint — recommended stack, folder layout, config files, CI/CD pipeline, npm scripts, and definition of done.

**When to use:** You're starting a QA repo from scratch and want a solid structure before writing the first test.

```
/qa-blueprint
/qa-blueprint --dev-repo /path/to/project
```

**Produces:** SCAN_MANIFEST.md, QA_REPO_BLUEPRINT.md.

---

## Delivery

### /qa-pr

**What it does:** Creates a draft PR from QA artifacts already on disk. Auto-detects git platform (GitHub, Azure DevOps, GitLab). Applies your team's branch naming convention (asks once, remembers forever). Shows what will be committed and waits for your confirmation before pushing.

**When to use:** After generating or fixing tests, you want to package them as a PR.

```
/qa-pr --ticket PROJ-123 --title "login tests"
/qa-pr --scope e2e
/qa-pr --ticket PROJ-456 --title "checkout" --base develop
```

**Produces:** Git branch (following your convention), commits, draft PR with summary. Returns the PR URL.

---

## Common Flows

### New project, no tests
```
/qa-start --dev-repo ./myproject --auto
```

### Mature project, new feature
```
/qa-map                                  # once
/create-test "password reset"            # generates + runs
/qa-pr --ticket PROJ-123 "reset tests"   # ships as PR
```

### From a Jira ticket
```
/qa-from-ticket https://jira.company.com/browse/PROJ-456
/qa-pr --ticket PROJ-456 "login flow"
```

### Fix broken tests after deploy
```
/qa-fix ./tests/e2e/checkout*
/qa-pr --ticket PROJ-789 "fix checkout tests"
```

### Quality check before release
```
/qa-audit ./tests --dev-repo ./myproject
/qa-report ./tests --audience management
```
package/docs/DEMO.md ADDED
# QAA — QA Automation Agent

## What is it?

QAA is a multi-agent system that automates QA test creation for any software project. You point it at a codebase, and it analyzes the architecture, maps the code, generates a full test suite following industry standards, validates everything, and delivers the result as a draft pull request — ready for review.

No manual test writing. No guessing what to cover. One command, full pipeline.

## The Problem

Writing test suites is slow, repetitive, and often inconsistent. Teams face:

- **Starting from zero is painful** — a new project with no tests means weeks of setup before the first real test runs
- **Coverage gaps are invisible** — without analysis, teams don't know what's missing until something breaks in production
- **Standards drift** — different team members write tests differently: inconsistent locators, vague assertions, mixed naming conventions
- **QA is always behind dev** — features ship faster than tests get written, and the gap keeps growing
- **Existing QA teams still spend hours on repetitive work** — even with a mature test suite, adding tests for new features means manually inspecting pages, finding locators, writing POMs, running tests, fixing failures, repeat

## The Solution

QAA runs a pipeline of specialized AI agents, each responsible for one stage:

```
scan → map → analyze → plan → generate → validate → deliver
```

| Stage | What happens | Output |
|-------|-------------|--------|
| **Scan** | Detects framework, language, testable surfaces | SCAN_MANIFEST.md |
| **Map** | Deep-scans codebase for testability, risk, patterns, existing tests (4 parallel agents) | 8 codebase documents |
| **Analyze** | Produces risk assessment, test inventory, testing pyramid | QA_ANALYSIS.md, TEST_INVENTORY.md |
| **Plan** | Groups test cases by feature, assigns to files, resolves dependencies | GENERATION_PLAN.md |
| **Generate** | Writes test files, POMs, fixtures, configs following project standards | Test suite on disk |
| **Validate** | 4-layer validation (syntax, structure, dependencies, logic) with auto-fix | VALIDATION_REPORT.md |
| **Deliver** | Creates branch, commits per stage, pushes, opens draft PR | Pull request URL |

Every agent reads the project's QA standards (CLAUDE.md) before producing output. Every test case has a unique ID, concrete inputs, and explicit expected outcomes — never "works correctly."
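The staged pipeline above can be sketched as a plain sequential orchestrator in which each stage reads the accumulated context and contributes its artifact. The stage bodies below are stand-ins, not QAA's agents:

```python
# Illustrative sequential pipeline: each stage consumes the shared context
# and records its artifact. Real stages would do work; these are stubs.
def run_pipeline(stages, context):
    for name, stage in stages:
        context[name] = stage(context)
    return context

stages = [
    ("scan",     lambda ctx: "SCAN_MANIFEST.md"),
    ("map",      lambda ctx: "8 codebase documents"),
    ("analyze",  lambda ctx: "QA_ANALYSIS.md, TEST_INVENTORY.md"),
    ("plan",     lambda ctx: "GENERATION_PLAN.md"),
    ("generate", lambda ctx: "test suite on disk"),
    ("validate", lambda ctx: "VALIDATION_REPORT.md"),
    ("deliver",  lambda ctx: "pull request URL"),
]
result = run_pipeline(stages, {})
print(result["deliver"])
```

The key property is that ordering is fixed and each stage's output is available to every later stage via the shared context.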

## Three Workflows

QAA adapts to where the project is in its QA maturity:

**1. No QA repo yet** — `/qa-start --dev-repo ./myproject`
Full pipeline from scratch. Produces a complete test suite, QA repo blueprint, and a draft PR with everything.

**2. Immature QA repo** — `/qa-start --dev-repo ./myproject --qa-repo ./tests`
Scans both repos, identifies gaps, fixes broken tests, adds missing coverage, standardizes existing tests.

**3. Mature QA repo** — `/qa-start --dev-repo ./myproject --qa-repo ./tests`
Makes surgical test additions only where coverage is thin. Doesn't touch working tests.

## The "Brain" — Codebase Map

Before generating anything, QAA maps the entire codebase with 4 parallel agents:

- **Testability** — what's testable, pure functions vs stateful code, mock boundaries
- **Risk** — business-critical paths, security-sensitive areas, data integrity risks
- **Patterns** — naming conventions, API shapes, import style, code patterns
- **Existing tests** — current test quality, frameworks in use, coverage gaps

These 8 documents become the shared context that every downstream agent reads. The analyzer uses risk data to prioritize tests. The planner uses testability data to estimate complexity. The executor uses code patterns to generate tests that match the project's style.

Result: generated tests feel native to the codebase, not generic boilerplate.

## Day-to-Day for a QA Engineer

This is where QAA shines for teams that already have a mature QA repo. The full pipeline is for bootstrapping — but the real daily value is the targeted workflow.

### The scenario

You're a QA engineer. A developer just shipped a new "password reset" feature. You need tests. Here's what happens:

### Step 1: Map the codebase (once)

```
/qa-map
```

QAA scans the entire project and builds its "brain" — 8 documents covering testability, risk areas, API contracts, code patterns, and existing test coverage. This runs once and stays valid until the codebase changes significantly.

### Step 2: Create tests for the feature

```
/create-test "password reset"
```

QAA already knows the codebase. It reads the brain documents, finds the relevant source files (`auth.service.ts`, `reset.controller.ts`, the reset page component), understands the API contracts, and generates:

- Unit tests for the reset token logic with concrete inputs and expected outputs
- API tests for `POST /api/auth/reset-password` with real request/response shapes
- E2E tests with Page Object Models that use the project's existing POM base class
- Fixtures with test data (fake emails, expired tokens, invalid tokens)

All following the project's naming conventions, import style, and assertion patterns.

### Step 3: Validate and fix in a loop

```
/qa-validate ./tests
```

The validator runs 4 layers of checks on every generated file:

1. **Syntax** — does it parse? Are imports correct?
2. **Structure** — does it follow POM rules? Are locators in the right tier?
3. **Dependencies** — do all imports resolve? Are mocks set up correctly?
4. **Logic** — are assertions concrete? Do test IDs follow the convention?

If issues are found, the validator auto-fixes them and re-checks — up to 3 loops. If something still fails, the bug detective classifies it: is it an application bug, a test code error, or an environment issue?

### Step 4: Run the tests with Playwright

QAA integrates with Playwright to actually execute the generated E2E tests against a running application. It opens the browser, navigates pages, fills forms, clicks buttons, and captures what happens. If a test fails, it reads the error, inspects the page state, and determines whether the locator is wrong, the page changed, or there's a real bug.

The loop looks like this:

```
generate → validate → run → failures? → classify → fix test code → run again → pass
```

This continues until the tests pass or the issue is classified as an application bug that needs a developer fix.

### Step 5: Ship it

```
/qa-pr --ticket PROJ-456 "password reset tests"
```

QAA creates a branch following your team's naming convention (it asked you once and remembers forever), commits the test files, pushes, and opens a draft PR on GitHub, Azure DevOps, or GitLab — whatever your team uses. You get the link.

### The full daily flow

```
/qa-map                                              → builds the "brain" (once)
/create-test "password reset"                        → generates tests using codebase knowledge
/qa-validate ./tests/unit/auth*                      → validates + auto-fixes
/qa-pr --ticket PROJ-456 "password reset tests"      → draft PR with link
```

From ticket to PR in minutes, not hours. And the tests follow the same standards as every other test in the repo because QAA read the existing patterns first.

### What about tickets?

If you work from Jira, Linear, or GitHub Issues, skip the manual description:

```
/qa-from-ticket https://company.atlassian.net/browse/PROJ-456
```

QAA fetches the ticket, extracts acceptance criteria and edge cases, maps each criterion to test cases with a traceability matrix, generates the tests, validates them, and gives you a report showing which AC is covered by which test.

### When tests break after a deploy

```
/qa-fix ./tests/e2e/checkout*
```

QAA reads the failing tests, runs them, classifies each failure (app bug vs test code error vs environment issue), and auto-fixes the test code errors. Application bugs get flagged for the dev team with evidence — the exact assertion that failed, what was expected, and what was received.

## Standards

Every test artifact follows strict rules:

- **Testing pyramid** — 60-70% unit, 10-15% integration, 20-25% API, 3-5% E2E
- **Locator hierarchy** — data-testid first, ARIA roles, labels, CSS as last resort (with TODO)
- **Page Object Model** — one class per page, no assertions in POMs, locators as properties
- **Assertions** — concrete values only. `expect(status).toBe(200)` not `expect(status).toBeTruthy()`
- **Naming** — unique IDs per test case: `UT-AUTH-001`, `API-USERS-003`, `E2E-CHECKOUT-001`
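
The ID scheme is regular enough to lint. A quick sketch with my own regex, inferred from the sample IDs above (the set of layer prefixes, including `INT`, is an assumption):

```python
import re

# Inferred from the examples above: LAYER-FEATURE-NNN, e.g. UT-AUTH-001.
# Layer prefixes are a guess based on the sample IDs; adjust to taste.
TEST_ID_RE = re.compile(r"^(UT|INT|API|E2E)-[A-Z]+-\d{3}$")

def valid_test_id(test_id: str) -> bool:
    return bool(TEST_ID_RE.fullmatch(test_id))

assert valid_test_id("UT-AUTH-001")
assert valid_test_id("API-USERS-003")
assert valid_test_id("E2E-CHECKOUT-001")
assert not valid_test_id("test_login")   # no layer/feature/sequence
assert not valid_test_id("UT-AUTH-1")    # sequence must be three digits
```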

## Learning System

QAA remembers your preferences across sessions. When you correct it — "use Playwright, not Cypress" or "our branches start with feature/" — it saves the rule permanently. Next time, every agent reads your preferences before generating output.

Preferences override defaults. Your team's conventions always win.

## Numbers

17 commands. 7 skills. 11 agents. 10 templates. 7 workflows.

Supports GitHub, Azure DevOps, and GitLab. Works with Playwright, Cypress, Jest, Vitest, pytest, and more — detects what the project uses and matches it.

One goal: you focus on building features, QAA handles the tests.
# Testing Guide

How to validate the QA Automation Agent against real repositories.

## Test Repos

Each workflow option requires a different type of repository to test against.

### Option 1 — Dev Only (No QA Repo)

Test the full generation pipeline: scan → codebase-map → analyze → generate → validate → [e2e-run] → deliver.

| Repo | Stack | Why it's good |
|------|-------|---------------|
| [devdbrandy/restful-ecommerce](https://github.com/devdbrandy/restful-ecommerce) | Node.js, Express | Minimalist e-commerce API. Products, orders, auth. Almost no tests — agent generates everything from scratch. |
| [themodernmonk7/E-commerce-API](https://github.com/themodernmonk7/E-commerce-API) | Express, MongoDB | Full CRUD: auth, products, orders, reviews. No QA repo. Good variety of endpoints. |
| [dinushchathurya/nodejs-ecommerce-api](https://github.com/dinushchathurya/nodejs-ecommerce-api) | Express, MongoDB | Full e-commerce: users, categories, products, orders, images. No tests. |

**How to test:**
```bash
git clone https://github.com/devdbrandy/restful-ecommerce.git /tmp/test-option1
cd /tmp/test-option1
# Open Claude Code in this directory, then:
/qa-start --dev-repo .
```

**Expected output:**
- SCAN_MANIFEST.md with Node.js/Express detection
- QA_ANALYSIS.md with architecture overview and risk assessment
- TEST_INVENTORY.md with 30+ test cases (pyramid-driven)
- QA_REPO_BLUEPRINT.md (since no QA repo exists)
- Generated test files (unit, API, E2E)
- VALIDATION_REPORT.md
- Draft PR with all artifacts

### Option 2 — Dev + Immature QA Repo

Test the gap analysis and augmentation pipeline.

| DEV Repo | Stack | QA Repo |
|----------|-------|---------|
| [oozdal/to-do-list-api](https://github.com/oozdal/to-do-list-api) | FastAPI, SQLAlchemy | Create a crude QA repo manually (see below) |
| [KenMwaura1/Fast-Api-example](https://github.com/KenMwaura1/Fast-Api-example) | FastAPI, PostgreSQL, SQLAlchemy | Create a crude QA repo manually (see below) |

**Creating a crude QA repo for testing:**

There is no ready-made open-source repo of intentionally bad tests, so create one manually:

```bash
mkdir -p /tmp/test-option2-qa/{tests,old_tests,postman}
cd /tmp/test-option2-qa

# Create a broken test file with hardcoded tokens
cat > tests/test_api.py << 'EOF'
import requests

# TODO: ask Juan for new token
TOKEN = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.expired"
BASE_URL = "http://localhost:8000"

def test_get_todos():
    r = requests.get(f"{BASE_URL}/todos", headers={"Authorization": f"Bearer {TOKEN}"})
    assert r.status_code  # No specific assertion!

def test_create_todo():
    r = requests.post(f"{BASE_URL}/todos", json={"title": "test"})
    # No assertion at all
EOF

# Create a broken Selenium test
cat > old_tests/test_ui.py << 'EOF'
from selenium import webdriver
# Unused import left behind
from selenium.webdriver.common.keys import Keys

def test_login():
    driver = webdriver.Chrome()  # Will fail - no ChromeDriver
    driver.get("http://localhost:3000/login")
    driver.find_element_by_class_name("login-btn").click()  # Deprecated API
EOF

# Create empty postman collection
echo '{"info": {"name": "Todo API"}, "item": []}' > postman/collection.json

# Bad README (printf so the \n is actually a newline)
printf '# QA Tests\nSome tests might need updating. Ask the team.\n' > README.md

git init && git add -A && git commit -m "initial qa setup"
```

**How to test:**
```bash
cd /tmp/test-option2-dev   # The FastAPI dev repo
/qa-start --dev-repo . --qa-repo /tmp/test-option2-qa
```

**Expected output:**
- Maturity score < 30 → Option 2 detected
- GAP_ANALYSIS.md showing broken tests, missing coverage
- Fixed test files (imports, assertions, configs)
- New test cases added for uncovered features
- VALIDATION_REPORT.md
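
The workflow routing hinges on this maturity score (the troubleshooting table at the end points at five scoring dimensions in `init.cjs`). A purely hypothetical sketch of how such a score might route the run — the dimension names and per-dimension caps are my guesses, not the real ones:

```python
# Hypothetical maturity scoring: five equally weighted dimensions,
# each 0-20, summing to 0-100. Names are illustrative guesses only.
def maturity_score(dims: dict) -> int:
    return sum(dims.values())

def route(score: int) -> str:
    if score < 30:
        return "option-2: fix + augment"        # immature QA repo
    if score > 70:
        return "option-3: surgical additions"   # mature QA repo
    return "in between: review gap analysis before choosing"

crude = {"structure": 5, "assertions": 2, "locators": 3, "ci": 0, "docs": 4}
mature = {"structure": 18, "assertions": 16, "locators": 17, "ci": 15, "docs": 14}
assert maturity_score(crude) == 14 and route(14).startswith("option-2")
assert maturity_score(mature) == 80 and route(80).startswith("option-3")
```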

### Option 3 — Dev + Mature QA Repo

Test the surgical additions pipeline.

| QA Repo | Stack | Why it's good |
|---------|-------|---------------|
| [OmonUrkinbaev/playwright-qa-automation](https://github.com/OmonUrkinbaev/playwright-qa-automation) | Playwright, TypeScript, POM | Production-style: UI + API tests, POM, data-driven, flaky handling, CI pipelines. |
| [idavidov13/Playwright-Framework](https://github.com/idavidov13/Playwright-Framework) | Playwright, TypeScript, POM | Custom fixtures, API mocking, Zod validation, GitHub Actions + GitLab CI. |
| [nareshnavinash/playwright-TS-pom](https://github.com/nareshnavinash/playwright-TS-pom) | Playwright, TypeScript, POM | GitLab CI, DataDog integration, ESLint, junit + HTML reporting. |

**How to test:**
```bash
# Clone the mature QA repo (it tests against a demo app)
git clone https://github.com/OmonUrkinbaev/playwright-qa-automation.git /tmp/test-option3-qa

# You need the DEV repo that the QA repo tests against
# Check the QA repo's README for the target application URL/repo

/qa-start --dev-repo /tmp/test-option3-dev --qa-repo /tmp/test-option3-qa
```

**Expected output:**
- Maturity score > 70 → Option 3 detected
- GAP_ANALYSIS.md showing thin coverage areas only
- Only missing tests added — existing tests untouched
- Existing POM conventions respected
- VALIDATION_REPORT.md

## Validation Checklist

After each test run, verify:

- [ ] Correct workflow option detected (1, 2, or 3)
- [ ] SCAN_MANIFEST.md has correct framework detection
- [ ] QA_ANALYSIS.md has real architecture info (not generic)
- [ ] TEST_INVENTORY.md has concrete inputs and expected outcomes
- [ ] Generated test files follow CLAUDE.md standards
- [ ] All locators use Tier 1 (data-testid) or Tier 2 (ARIA roles)
- [ ] No assertions use toBeTruthy/toBeDefined
- [ ] POM has no assertions (if E2E tests generated)
- [ ] VALIDATION_REPORT.md shows 4-layer check results
- [ ] Draft PR created with full summary and reviewer checklist
- [ ] No hardcoded credentials in any generated file

## Troubleshooting Test Runs

| Issue | Fix |
|-------|-----|
| `gh: command not found` | Install GitHub CLI: `brew install gh` or `winget install GitHub.cli` |
| `gh auth` error | Run `gh auth login` first |
| Agent detects wrong framework | Check SCAN_MANIFEST.md — may need to adjust scanner detection patterns |
| Maturity score seems wrong | Review the 5 scoring dimensions in init.cjs cmdInitQaStart |
| Tests fail to validate | Check if the test framework is installed in the target repo |