qaa-agent 1.7.0 → 1.7.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md ADDED
@@ -0,0 +1,380 @@
1
+ # QAA - QA Automation Agent
2
+
3
+ [![npm version](https://img.shields.io/npm/v/qaa-agent.svg)](https://www.npmjs.com/package/qaa-agent)
4
+ [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
5
+
6
+ Multi-agent QA pipeline for [Claude Code](https://docs.anthropic.com/en/docs/claude-code). Analyzes any codebase, generates a complete test suite following industry standards, validates everything, and delivers the result as a draft pull request.
7
+
8
+ ```
9
+ scan → map → analyze → plan → generate → validate → deliver
10
+ ```
11
+
12
+ No manual test writing. No guessing what to cover. One command, full pipeline.
13
+
14
+ ---
15
+
16
+ ## The Problem
17
+
18
+ - **Starting from zero is painful** — a new project with no tests means weeks of setup
19
+ - **Coverage gaps are invisible** — without analysis, teams don't know what's missing until production breaks
20
+ - **Standards drift** — different team members write tests differently: inconsistent locators, vague assertions, mixed naming
21
+ - **QA is always behind dev** — features ship faster than tests get written
22
+
23
+ ## The Solution
24
+
25
+ QAA runs a pipeline of 12 specialized AI agents across seven stages:
26
+
27
+ | Stage | What happens | Output |
28
+ |-------|-------------|--------|
29
+ | **Scan** | Detects framework, language, testable surfaces | `SCAN_MANIFEST.md` |
30
+ | **Map** | Deep-scans codebase with 4 parallel agents (testability, risk, patterns, existing tests) | 8 codebase documents |
31
+ | **Analyze** | Produces risk assessment, test inventory, testing pyramid | `QA_ANALYSIS.md`, `TEST_INVENTORY.md` |
32
+ | **Plan** | Groups test cases by feature, assigns to files, resolves dependencies | `GENERATION_PLAN.md` |
33
+ | **Generate** | Writes test files, POMs, fixtures, configs following project standards | Test suite on disk |
34
+ | **Validate** | 4-layer validation (syntax, structure, dependencies, logic) with auto-fix | `VALIDATION_REPORT.md` |
35
+ | **Deliver** | Creates branch, commits per stage, opens draft PR | Pull request URL |
36
+
37
+ ---
38
+
39
+ ## Install
40
+
41
+ ```bash
42
+ npx qaa-agent
43
+ ```
44
+
45
+ The interactive installer:
46
+
47
+ 1. Copies agents, commands, skills, templates, and workflows into your runtime directory
48
+ 2. Configures the [Playwright MCP](https://github.com/microsoft/playwright-mcp) server in your user-scope config (`~/.claude.json`) so it's available in **all projects**
49
+ 3. Merges required permissions into `settings.json`
50
+
51
+ **Supported runtimes:** Claude Code, OpenCode
52
+
53
+ **Install scope:** Global (`~/.claude/`, available in all projects) or Local (`./.claude/`, this project only)
54
+
55
+ ### Requirements
56
+
57
+ - [Node.js](https://nodejs.org/) 18+
58
+ - [Claude Code](https://docs.anthropic.com/en/docs/claude-code) installed
59
+
60
+ ### Playwright MCP (required for E2E)
61
+
62
+ QAA uses [`@playwright/mcp`](https://www.npmjs.com/package/@playwright/mcp) to open a real browser, extract locators from live pages, run E2E tests, and auto-fix locator mismatches.
63
+
64
+ **If the installer did not configure it for your runtime, set up the Playwright MCP server manually:**
65
+
66
+ <details>
67
+ <summary><strong>VS Code (Claude Code extension)</strong></summary>
68
+
69
+ 1. Open VS Code Settings (`Ctrl+Shift+P` > `Preferences: Open User Settings (JSON)`)
70
+ 2. Add the MCP server config:
71
+
72
+ ```json
73
+ {
74
+ "claude-code.mcpServers": {
75
+ "playwright": {
76
+ "command": "npx",
77
+ "args": ["@playwright/mcp@latest"]
78
+ }
79
+ }
80
+ }
81
+ ```
82
+
83
+ Or add it to your project's `.vscode/mcp.json`:
84
+
85
+ ```json
86
+ {
87
+ "servers": {
88
+ "playwright": {
89
+ "command": "npx",
90
+ "args": ["@playwright/mcp@latest"]
91
+ }
92
+ }
93
+ }
94
+ ```
95
+
96
+ </details>
97
+
98
+ <details>
99
+ <summary><strong>Claude Code CLI</strong></summary>
100
+
101
+ Add to `~/.claude.json` (user-scope, all projects):
102
+
103
+ ```json
104
+ {
105
+ "mcpServers": {
106
+ "playwright": {
107
+ "command": "npx",
108
+ "args": ["@playwright/mcp@latest"]
109
+ }
110
+ }
111
+ }
112
+ ```
113
+
114
+ Or add a `.mcp.json` file in your project root for project-scope only.
115
+
116
+ </details>
117
+
118
+ Once configured, Playwright MCP enables QAA to:
119
+ - Open a real browser and navigate your running app
120
+ - Extract actual locators (`data-testid`, ARIA roles, labels) from live pages
121
+ - Run E2E tests, capture failures, and auto-fix locator mismatches
122
+ - Build a persistent **Locator Registry** (`.qa-output/locators/`) that caches real locators across features
123
+
124
+ ---
125
+
126
+ ## Quick Start
127
+
128
+ ### New project, no tests
129
+
130
+ ```
131
+ /qa-start --dev-repo ./myproject --auto
132
+ ```
133
+
134
+ Runs the full pipeline end-to-end: scan, map, analyze, plan, generate, validate, and deliver as a draft PR.
135
+
136
+ ### Mature project, new feature
137
+
138
+ ```
139
+ /qa-map # build the "brain" (once)
140
+ /qa-create-test "password reset" # generate tests using codebase knowledge
141
+ /qa-pr --ticket PROJ-123 "password reset tests" # ship as draft PR
142
+ ```
143
+
144
+ ### From a Jira ticket
145
+
146
+ ```
147
+ /qa-from-ticket https://company.atlassian.net/browse/PROJ-456
148
+ /qa-pr --ticket PROJ-456 "login flow tests"
149
+ ```
150
+
151
+ ### Fix broken tests after a deploy
152
+
153
+ ```
154
+ /qa-fix ./tests/e2e/checkout*
155
+ /qa-pr --ticket PROJ-789 "fix checkout tests"
156
+ ```
157
+
158
+ ---
159
+
160
+ ## Commands
161
+
162
+ | Command | Purpose |
163
+ |---------|---------|
164
+ | `/qa-start` | Full pipeline end-to-end (scan through PR) |
165
+ | `/qa-map` | Deep codebase analysis with 4 parallel agents |
166
+ | `/qa-create-test <feature>` | Generate tests for a specific feature |
167
+ | `/qa-fix [path]` | Diagnose and fix broken tests |
168
+ | `/qa-audit [path]` | 6-dimension quality audit with scoring |
169
+ | `/qa-pr` | Create a draft pull request from QA artifacts |
170
+ | `/qa-testid [path]` | Inject `data-testid` attributes into components |
171
+
172
+ ### Additional Commands
173
+
174
+ | Command | Purpose |
175
+ |---------|---------|
176
+ | `/qa-from-ticket <url>` | Generate tests from a Jira/Linear/GitHub Issue |
177
+ | `/qa-analyze` | Analyze a repo without generating tests |
178
+ | `/qa-validate [path]` | Validate test files against standards |
179
+ | `/qa-gap` | Find coverage gaps between dev and QA repos |
180
+ | `/qa-report` | Generate a QA status report |
182
+ | `/qa-blueprint` | Generate QA repo structure from scratch |
183
+ | `/qa-research` | Research best testing stack for a project |
184
+ | `/qa-pom` | Generate Page Object Models |
185
+ | `/update-test` | Improve existing tests incrementally |
186
+
187
+ See [COMMANDS.md](docs/COMMANDS.md) for full usage details and flags.
188
+
189
+ ---
190
+
191
+ ## Three Workflows
192
+
193
+ QAA adapts to the project's QA maturity:
194
+
195
+ **Option 1: No QA repo yet** — Full pipeline from scratch. Produces a complete test suite, repo blueprint, and draft PR.
196
+
197
+ ```
198
+ /qa-start --dev-repo ./myproject
199
+ ```
200
+
201
+ **Option 2: Immature QA repo** — Scans both repos, fixes broken tests, fills coverage gaps, standardizes existing tests.
202
+
203
+ ```
204
+ /qa-start --dev-repo ./myproject --qa-repo ./tests
205
+ ```
206
+
207
+ **Option 3: Mature QA repo** — Surgical additions only. Finds thin coverage areas and adds targeted tests without touching working code.
208
+
209
+ ```
210
+ /qa-start --dev-repo ./myproject --qa-repo ./tests
211
+ ```
212
+
213
+ ---
214
+
215
+ ## The "Brain" — Codebase Map
216
+
217
+ Before generating anything, QAA maps the codebase with 4 parallel agents producing 8 documents:
218
+
219
+ | Focus | Documents |
220
+ |-------|-----------|
221
+ | **Testability** | `TESTABILITY.md`, `TEST_SURFACE.md` — what's testable, entry points, mock boundaries |
222
+ | **Risk** | `RISK_MAP.md`, `CRITICAL_PATHS.md` — business-critical paths, security-sensitive areas |
223
+ | **Patterns** | `CODE_PATTERNS.md`, `API_CONTRACTS.md` — naming conventions, API shapes, import style |
224
+ | **Existing tests** | `TEST_ASSESSMENT.md`, `COVERAGE_GAPS.md` — current quality, frameworks, gaps |
225
+
226
+ Every downstream agent reads these documents. The result: generated tests feel native to the codebase, not generic boilerplate.
227
+
228
+ ---
229
+
230
+ ## Standards Enforced
231
+
232
+ Every generated artifact follows strict rules defined in [CLAUDE.md](CLAUDE.md):
233
+
234
+ ### Testing Pyramid
235
+
236
+ ```
237
+        /   E2E   \       3-5%  (critical path smoke only)
238
+       /    API    \     20-25% (endpoints + contracts)
239
+      / Integration \    10-15% (component interactions)
240
+     /     Unit      \   60-70% (business logic, pure functions)
241
+ ```
242
+
243
+ ### Locator Hierarchy
244
+
245
+ 1. **Tier 1 (Best):** `data-testid`, ARIA roles with accessible names
246
+ 2. **Tier 2 (Good):** Form labels, placeholders, visible text
247
+ 3. **Tier 3 (Acceptable):** Alt text, title attributes
248
+ 4. **Tier 4 (Last Resort):** CSS selectors, XPath — always with a `// TODO` comment
249
+
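The fallback order above can be captured in a small pure function. This is a hedged sketch with invented names (`ElementInfo`, `bestLocator`), not QAA's actual code:

```typescript
// Illustrative only: pick the highest available tier for an element,
// mirroring the hierarchy above. The input shape is hypothetical.
interface ElementInfo {
  testId?: string;  // data-testid value
  role?: string;    // ARIA role with accessible name
  label?: string;   // form label text
  alt?: string;     // image alt text
  css?: string;     // raw CSS selector (last resort)
}

function bestLocator(el: ElementInfo): { tier: number; locator: string } {
  if (el.testId) return { tier: 1, locator: `[data-testid="${el.testId}"]` };
  if (el.role) return { tier: 1, locator: `role=${el.role}` };
  if (el.label) return { tier: 2, locator: `label=${el.label}` };
  if (el.alt) return { tier: 3, locator: `img[alt="${el.alt}"]` };
  // Tier 4: emit the TODO marker required by the rule above
  return { tier: 4, locator: `${el.css ?? ""} /* TODO: add data-testid */` };
}

console.log(bestLocator({ testId: "login-submit" }).tier); // 1
console.log(bestLocator({ css: ".btn:nth-child(2)" }).tier); // 4
```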
250
+ ### Page Object Model
251
+
252
+ - One class per page, no god objects
253
+ - No assertions in POMs — assertions belong in test specs
254
+ - Locators as readonly properties
255
+ - Every POM extends a shared `BasePage`
256
+
257
+ ### Assertion Quality
258
+
259
+ ```typescript
260
+ // Good — concrete values
261
+ expect(response.status).toBe(200);
262
+ expect(data.name).toBe('Test User');
263
+
264
+ // Bad — never do this
265
+ expect(response.status).toBeTruthy();
266
+ expect(data).toBeDefined();
267
+ ```
268
+
269
+ ### Test Case IDs
270
+
271
+ Every test case has a unique ID following the pattern:
272
+ - `UT-MODULE-001` — unit tests
273
+ - `INT-MODULE-001` — integration tests
274
+ - `API-RESOURCE-001` — API tests
275
+ - `E2E-FLOW-001` — E2E tests
276
+
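The ID scheme is easy to enforce mechanically. A minimal sketch (the regex and `isValidTestId` are illustrative, not part of QAA):

```typescript
// Validate QAA-style test case IDs such as "UT-AUTH-001" or "E2E-CHECKOUT-014".
// Prefixes mirror the scheme above; the exact pattern is our assumption.
const TEST_ID_PATTERN = /^(UT|INT|API|E2E)-[A-Z0-9]+-\d{3}$/;

function isValidTestId(id: string): boolean {
  return TEST_ID_PATTERN.test(id);
}

console.log(isValidTestId("UT-AUTH-001"));      // true
console.log(isValidTestId("E2E-CHECKOUT-014")); // true
console.log(isValidTestId("unit-auth-1"));      // false
```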
277
+ ---
278
+
279
+ ## Validation
280
+
281
+ Generated tests pass through a 4-layer validation with auto-fix (up to 3 loops):
282
+
283
+ 1. **Syntax** — does it parse? Are imports correct?
284
+ 2. **Structure** — POM rules, file organization, naming conventions
285
+ 3. **Dependencies** — all imports resolve, mocks set up correctly
286
+ 4. **Logic** — assertions are concrete, locators follow tier hierarchy
287
+
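The validate/auto-fix cycle above can be sketched as a bounded loop (function shapes are invented for illustration; the real agents operate on files and reports):

```typescript
// Conceptual sketch of the 4-layer validation loop with auto-fix, capped at
// 3 passes as described above. Anything left over goes to the Bug Detective.
type Issue = { layer: "syntax" | "structure" | "dependencies" | "logic"; msg: string };

function validateWithAutoFix(
  validate: () => Issue[],
  autoFix: (issues: Issue[]) => void,
  maxLoops = 3,
): Issue[] {
  let issues = validate();
  for (let i = 0; i < maxLoops && issues.length > 0; i++) {
    autoFix(issues);
    issues = validate(); // re-validate after each fix pass
  }
  return issues;
}
```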
288
+ If issues remain, the **Bug Detective** classifies each failure:
289
+
290
+ | Classification | Action |
291
+ |----------------|--------|
292
+ | `APPLICATION BUG` | Flagged for developer — not auto-fixed |
293
+ | `TEST CODE ERROR` | Auto-fixed at HIGH confidence |
294
+ | `ENVIRONMENT ISSUE` | Documented with setup instructions |
295
+ | `INCONCLUSIVE` | Flagged with evidence for manual review |
296
+
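Conceptually, the classification step maps failure evidence to one of these four verdicts. A deliberately simplified keyword heuristic (the real Bug Detective weighs much richer evidence; all names here are ours):

```typescript
// Toy classifier: route a failure message to one of the four verdicts above.
type Verdict = "APPLICATION BUG" | "TEST CODE ERROR" | "ENVIRONMENT ISSUE" | "INCONCLUSIVE";

function classifyFailure(message: string): Verdict {
  // Connectivity problems usually mean the environment, not the code
  if (/ECONNREFUSED|ENOTFOUND|timed? ?out waiting for server/i.test(message)) {
    return "ENVIRONMENT ISSUE";
  }
  // Broken locators, imports, or call sites point at the test itself
  if (/locator|selector|is not a function|Cannot find module/i.test(message)) {
    return "TEST CODE ERROR";
  }
  // Server-side errors suggest an application defect
  if (/500 Internal Server Error|unhandled exception/i.test(message)) {
    return "APPLICATION BUG";
  }
  return "INCONCLUSIVE";
}
```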
297
+ ---
298
+
299
+ ## Framework Support
300
+
301
+ QAA auto-detects the project's existing stack and matches it:
302
+
303
+ **Languages:** JavaScript/TypeScript, Python, Java, .NET/C#, Go, Ruby, PHP, Rust
304
+
305
+ **Test Frameworks:** Playwright, Cypress, Jest, Vitest, pytest, Selenium, and more
306
+
307
+ **Build Tools:** Vite, Next.js, Nuxt, Angular, Vue, Webpack, SvelteKit
308
+
309
+ **Git Platforms:** GitHub, Azure DevOps, GitLab
310
+
311
+ ---
312
+
313
+ ## Learning System
314
+
315
+ QAA remembers your preferences across sessions. When you correct it — "use Playwright, not Cypress" or "our branches start with `feature/`" — it saves the rule permanently to `MY_PREFERENCES.md`. Every agent reads your preferences before generating output.
316
+
317
+ Your team's conventions always win over defaults.
318
+
319
+ ---
320
+
321
+ ## Architecture
322
+
323
+ ```
324
+ qaa-agent/
325
+ agents/ # 12 specialized QA agents
326
+ commands/ # 7 slash commands (user-facing entry points)
327
+ skills/ # 6 reusable skills
328
+ templates/ # 10 artifact templates (output format contracts)
329
+ workflows/ # 7 workflow orchestration specs
330
+ bin/ # Installer and CLI tools
331
+ docs/ # User documentation
332
+ CLAUDE.md # QA standards (read by every agent)
333
+ .mcp.json # Playwright MCP server config
334
+ settings.json # Claude Code permissions
335
+ ```
336
+
337
+ ### Agents
338
+
339
+ | Agent | Responsibility |
340
+ |-------|---------------|
341
+ | `qa-scanner` | Framework detection, file tree scanning |
342
+ | `qa-codebase-mapper` | 4-parallel-agent deep analysis |
343
+ | `qa-analyzer` | Risk assessment, test inventory, pyramid |
344
+ | `qa-planner` | Test case grouping, file assignment |
345
+ | `qa-executor` | Test file, POM, fixture generation |
346
+ | `qa-validator` | 4-layer validation with auto-fix |
347
+ | `qa-e2e-runner` | Browser-based test execution via Playwright MCP |
348
+ | `qa-bug-detective` | Failure classification with evidence |
349
+ | `qa-testid-injector` | `data-testid` attribute injection |
350
+ | `qa-project-researcher` | Testing stack research |
351
+ | `qa-discovery` | Project discovery |
352
+ | `qa-pipeline-orchestrator` | Pipeline coordination |
353
+
354
+ ---
355
+
356
+ ## Git Workflow
357
+
358
+ QAA follows strict git conventions:
359
+
360
+ - **Branch:** `qa/auto-{project}-{date}` (e.g., `qa/auto-shopflow-2026-03-18`)
361
+ - **Commits:** One per agent stage — `qa(scanner): produce SCAN_MANIFEST.md for shopflow`
362
+ - **PR:** Draft PR with analysis summary, test counts, coverage metrics, validation status
363
+
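The branch convention can be sketched as a one-line formatter (hypothetical helper, not the installer's actual code):

```typescript
// Build a branch name in the qa/auto-{project}-{date} form shown above.
function qaBranchName(project: string, date: Date): string {
  const iso = date.toISOString().slice(0, 10); // YYYY-MM-DD
  return `qa/auto-${project.toLowerCase()}-${iso}`;
}

console.log(qaBranchName("ShopFlow", new Date("2026-03-18"))); // qa/auto-shopflow-2026-03-18
```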
364
+ ---
365
+
366
+ ## Documentation
367
+
368
+ - [Commands Reference](docs/COMMANDS.md) — all commands with flags and examples
369
+ - [Demo & Walkthrough](docs/DEMO.md) — problem/solution explanation with real workflows
370
+ - [Changelog](CHANGELOG.md) — version history
371
+
372
+ ---
373
+
374
+ ## License
375
+
376
+ [MIT](LICENSE)
377
+
378
+ ---
379
+
380
+ Built by [Capmation](https://github.com/capmation)
package/bin/install.cjs CHANGED
@@ -146,9 +146,21 @@ async function main() {
146
146
  // Install .mcp.json (Playwright MCP server config)
147
147
  const mcpSrc = path.join(ROOT, '.mcp.json');
148
148
  if (fs.existsSync(mcpSrc)) {
149
- const mcpDest = path.join(qaaDir, '.mcp.json');
150
- copyFile(mcpSrc, mcpDest);
151
- ok('Installed Playwright MCP server config (.mcp.json)');
149
+ // Copy to qaa dir for reference
150
+ copyFile(mcpSrc, path.join(qaaDir, '.mcp.json'));
151
+
152
+ // Merge MCP servers into ~/.claude.json (user-scope) so they're available in ALL projects
153
+ // Note: ~/.claude/.mcp.json is project-scope for ~/.claude/ only — NOT global
154
+ const userConfigPath = path.join(HOME, '.claude.json');
155
+ let userConfig = {};
156
+ if (fs.existsSync(userConfigPath)) {
157
+ try { userConfig = JSON.parse(fs.readFileSync(userConfigPath, 'utf8')); } catch {}
158
+ }
159
+ userConfig.mcpServers = userConfig.mcpServers || {};
160
+ const qaaMcp = JSON.parse(fs.readFileSync(mcpSrc, 'utf8'));
161
+ Object.assign(userConfig.mcpServers, qaaMcp.mcpServers);
162
+ fs.writeFileSync(userConfigPath, JSON.stringify(userConfig, null, 2));
163
+ ok('Installed Playwright MCP server config (user-scope — available in all projects)');
152
164
  }
153
165
 
154
166
  // Write version
@@ -5,12 +5,13 @@ Validate, diagnose, and fix test files — all in one command. Runs 4-layer stat
5
5
  ## Usage
6
6
 
7
7
  ```
8
- /qa-fix [<test-directory>] [options]
8
+ /qa-fix [<test-files-or-directory>] [options]
9
9
  ```
10
10
 
11
11
  ### Options
12
12
 
13
- - `<test-directory>` — path to test files (auto-detects if omitted)
13
+ - `<test-files-or-directory>` — one or more test file paths or a directory (auto-detects if omitted)
14
+ - `--check` — **final check mode**: full quality verification against company preferences and codebase conventions, plus test execution
14
15
  - `--validate-only` — run 4-layer static validation only, no test execution or classification
15
16
  - `--classify` — run tests and classify failures, but do NOT auto-fix
16
17
  - `--run --app-url <url>` — also execute E2E tests against live app after static validation
@@ -20,7 +21,9 @@ Validate, diagnose, and fix test files — all in one command. Runs 4-layer stat
20
21
  ### Mode Detection
21
22
 
22
23
  ```
23
- if --validate-only:
24
+ if --check:
25
+ MODE = "check" → full quality check + execution + QA_CHECK_REPORT.md
26
+ elif --validate-only:
24
27
  MODE = "validate" → 4-layer static validation + VALIDATION_REPORT.md
25
28
  elif --classify:
26
29
  MODE = "classify" → run tests + classify failures (no auto-fix)
@@ -32,6 +35,8 @@ else:
32
35
 
33
36
  | Mode | Artifacts |
34
37
  |------|-----------|
38
+ | check | QA_CHECK_REPORT.md (full quality verification with pass/fail per test file) |
39
+ | check --ticket | QA_CHECK_REPORT.md + UAT_VERIFICATION.md (step-by-step screenshots vs ticket acceptance criteria) |
35
40
  | validate | VALIDATION_REPORT.md (syntax, structure, dependencies, logic per file) |
36
41
  | classify | FAILURE_CLASSIFICATION_REPORT.md (per-failure evidence, no fixes) |
37
42
  | fix | FAILURE_CLASSIFICATION_REPORT.md + auto-fixed test files |
@@ -57,6 +62,249 @@ App URL: {url or "auto-detect"}
57
62
 
58
63
  ---
59
64
 
65
+ ### CHECK MODE (`--check`) — Final Quality Verification
66
+
67
+ Full quality check for specific test files. Reads ALL context sources, verifies every aspect of the tests, runs them, and produces a pass/fail report. Use this as a final gate before delivering tests.
68
+
69
+ **Accepts specific files:**
70
+ ```
71
+ /qa-fix --check tests/e2e/login.e2e.spec.ts tests/e2e/checkout.e2e.spec.ts
72
+ /qa-fix --check tests/unit/auth.unit.spec.ts
73
+ /qa-fix --check tests/e2e/ --app-url http://localhost:3000
74
+ ```
75
+
76
+ **Step 1: Read ALL context sources**
77
+
78
+ Read every available context source — this is not optional, all must be read:
79
+
80
+ 1. **CLAUDE.md** — QA standards, POM rules, locator tiers, assertion rules, naming conventions, quality gates
81
+ 2. **~/.claude/qaa/MY_PREFERENCES.md** — company/user preferences that OVERRIDE CLAUDE.md rules
82
+ 3. **Codebase map** (`.qa-output/codebase/`):
83
+ - `CODE_PATTERNS.md` — naming conventions, import style, file organization (are tests matching the project's style?)
84
+ - `API_CONTRACTS.md` — real API shapes (are API test payloads correct?)
85
+ - `TEST_SURFACE.md` — function signatures (are test targets real?)
86
+ - `TESTABILITY.md` — mock boundaries (are mocks set up correctly?)
87
+ 4. **Locator Registry** (`.qa-output/locators/`) — real locators from the app (are POM locators accurate?)
88
+ 5. **Existing test patterns** — read 2-3 existing test files in the same repo to understand current conventions (describe block style, import patterns, assertion patterns, fixture usage)
89
+
90
+ If codebase map is missing, STOP and tell the user to run `/qa-map` first.
91
+
92
+ **Step 2: Verify each test file across 7 dimensions**
93
+
94
+ For EACH selected test file, check:
95
+
96
+ | # | Dimension | What to check | Source |
97
+ |---|-----------|---------------|--------|
98
+ | 1 | **Naming** | File name, test IDs, describe/it names follow conventions | CLAUDE.md + CODE_PATTERNS.md + MY_PREFERENCES.md |
99
+ | 2 | **Structure** | Correct directory, imports resolve, follows repo patterns | CODE_PATTERNS.md + existing tests in repo |
100
+ | 3 | **Locators** | POM locators match registry, Tier 1 preferred, no stale selectors | LOCATOR_REGISTRY.md |
101
+ | 4 | **Assertions** | Concrete values (no toBeTruthy alone), match API contracts | CLAUDE.md + API_CONTRACTS.md |
102
+ | 5 | **POM compliance** | No assertions in POMs, locators as properties, extends BasePage | CLAUDE.md |
103
+ | 6 | **Code quality** | No redundant code, no dead code, no hardcoded credentials, no copy-paste | Code review |
104
+ | 7 | **Company conventions** | Matches all rules in MY_PREFERENCES.md | MY_PREFERENCES.md |
105
+
106
+ **Step 3: Run the tests**
107
+
108
+ Execute the selected test files:
109
+
110
+ ```bash
111
+ # Detect test runner from project config
112
+ npx playwright test {files} --reporter=json 2>&1 # if Playwright
113
+ npx cypress run --spec {files} 2>&1 # if Cypress
114
+ npx jest {files} --json 2>&1 # if Jest
115
+ npx vitest run {files} --reporter=json 2>&1 # if Vitest
116
+ ```
117
+
118
+ If E2E tests and app URL available, also verify with Playwright MCP:
119
+ - Navigate to each page referenced in the tests
120
+ - `browser_snapshot()` to verify elements exist in DOM
121
+ - Cross-reference locators against real page
122
+
123
+ **Step 4: Fix issues found**
124
+
125
+ For each issue found:
126
+ - **AUTO-FIX** (HIGH confidence): naming, imports, locator mismatches, missing await, Tier 4→Tier 1 upgrade when registry has the value
127
+ - **FLAG for review** (MEDIUM/LOW): logic changes, assertion value changes, structural refactors
128
+ - Re-run tests after fixes (max 5 loops)
129
+
130
+ **Step 5: Produce QA_CHECK_REPORT.md**
131
+
132
+ ```markdown
133
+ # QA Check Report
134
+
135
+ ## Summary
136
+
137
+ | Metric | Value |
138
+ |--------|-------|
139
+ | Files checked | {N} |
140
+ | Dimensions checked | 7 |
141
+ | Issues found | {N} |
142
+ | Auto-fixed | {N} |
143
+ | Flagged for review | {N} |
144
+ | Tests passed | {N}/{total} |
145
+ | Overall | PASS / PASS WITH WARNINGS / FAIL |
146
+
147
+ ## Per-File Results
148
+
149
+ ### {file_path}
150
+
151
+ | Dimension | Status | Details |
152
+ |-----------|--------|---------|
153
+ | Naming | PASS/FAIL | {specific details} |
154
+ | Structure | PASS/FAIL | {specific details} |
155
+ | Locators | PASS/FAIL | {specific details} |
156
+ | Assertions | PASS/FAIL | {specific details} |
157
+ | POM compliance | PASS/FAIL | {specific details} |
158
+ | Code quality | PASS/FAIL | {specific details} |
159
+ | Company conventions | PASS/FAIL | {specific details} |
160
+
161
+ **Test execution:** PASS / FAIL ({error if failed})
162
+ **Fixes applied:** {list of auto-fixes}
163
+ **Flagged for review:** {list of items needing human review}
164
+
165
+ [... repeat per file ...]
166
+
167
+ ## Flagged Items (Needs Human Review)
168
+
169
+ | File | Dimension | Issue | Suggested Fix |
170
+ |------|-----------|-------|---------------|
171
+ | ... | ... | ... | ... |
172
+ ```
173
+
174
+ Write to `.qa-output/QA_CHECK_REPORT.md`.
175
+
176
+ Present results to user with clear PASS/FAIL per file and overall status.
177
+
178
+ **Step 6 (optional): Ticket Verification (`--ticket <source>`)**
179
+
180
+ If `--ticket` flag is provided, perform UAT verification — walk through the test flow step-by-step in the browser, take screenshots at each step, and compare against the ticket's acceptance criteria.
181
+
182
+ **Usage:**
183
+ ```
184
+ /qa-fix --check --ticket #123 tests/e2e/login.e2e.spec.ts --app-url http://localhost:3000
185
+ /qa-fix --check --ticket https://company.atlassian.net/browse/PROJ-456 tests/e2e/checkout.e2e.spec.ts
186
+ /qa-fix --check --ticket "User logs in, sees dashboard with welcome message, clicks profile" tests/e2e/login.e2e.spec.ts
187
+ ```
188
+
189
+ **Requires:** `--app-url` or auto-detected running app. Cannot do ticket verification without a live app.
190
+
191
+ **Step 6a: Fetch and parse the ticket**
192
+
193
+ Same ticket parsing as `/qa-create-test` from-ticket mode:
194
+ - GitHub Issue: `gh issue view` → extract title, body, ACs
195
+ - Jira/Linear URL: `WebFetch` → extract content
196
+ - Plain text: use directly as acceptance criteria
197
+ - File path: read file content
198
+
199
+ Extract:
200
+ - Acceptance criteria (AC-1, AC-2, ...)
201
+ - Expected user flow (step-by-step)
202
+ - Expected outcomes per step
203
+
204
+ **Step 6b: Walk through the flow with Playwright MCP**
205
+
206
+ For each E2E test file being checked, replay the user journey in the browser step by step via Playwright MCP (outside the test runner):
207
+
208
+ ```
209
+ For each step in the ticket's user flow:
210
+
211
+ 1. Execute the action described in the step:
212
+ - Navigate: mcp__playwright__browser_navigate({ url: "{page}" })
213
+ - Fill form: mcp__playwright__browser_fill_form({ ... })
214
+ - Click: mcp__playwright__browser_click({ element: "..." })
215
+ - Wait: mcp__playwright__browser_wait_for({ text: "..." })
216
+
217
+ 2. Take screenshot AFTER the action:
218
+ mcp__playwright__browser_take_screenshot()
219
+ → Save to .qa-output/uat-screenshots/{test-name}-step-{N}.png
220
+
221
+ 3. Take accessibility snapshot to read page state:
222
+ mcp__playwright__browser_snapshot()
223
+
224
+ 4. Record what the page shows:
225
+ - URL after action
226
+ - Visible text/headings
227
+ - Form state
228
+ - Error messages (if any)
229
+ - Elements visible/hidden
230
+ ```
231
+
232
+ **Step 6c: Compare actual vs ticket**
233
+
234
+ For each acceptance criterion from the ticket:
235
+
236
+ | AC | Expected (from ticket) | Actual (from browser) | Screenshot | Verdict |
237
+ |----|----------------------|---------------------|------------|---------|
238
+ | AC-1 | User sees login form | Login form visible with email/password fields | step-1.png | MATCH |
239
+ | AC-2 | After login, redirect to dashboard | Redirected to /dashboard, "Welcome" visible | step-3.png | MATCH |
240
+ | AC-3 | Error message for wrong password | "Invalid credentials" alert shown | step-5.png | MATCH |
241
+ | AC-4 | Remember me keeps session | Session persists after browser close | step-7.png | MISMATCH — session expired |
242
+
243
+ Verdicts:
244
+ - **MATCH** — actual behavior matches what the ticket describes
245
+ - **MISMATCH** — actual behavior differs from ticket (could be app bug OR test not covering this AC)
246
+ - **NOT TESTED** — ticket has an AC but no test step covers it
247
+ - **EXTRA** — test covers something not in the ticket (informational, not a failure)
248
+
249
+ **Step 6d: Produce UAT_VERIFICATION.md**
250
+
251
+ ```markdown
252
+ # UAT Verification Report
253
+
254
+ ## Ticket Info
255
+
256
+ | Field | Value |
257
+ |-------|-------|
258
+ | Source | {ticket URL or text} |
259
+ | Title | {ticket title} |
260
+ | ACs extracted | {count} |
261
+ | Test files verified | {count} |
262
+
263
+ ## Step-by-Step Walkthrough
264
+
265
+ ### Step 1: {action description}
266
+ - **Action:** Navigate to /login
267
+ - **Screenshot:** [step-1.png](.qa-output/uat-screenshots/{test}-step-1.png)
268
+ - **Page state:** Login form visible, email and password fields empty, "Log in" button enabled
269
+ - **Matches AC:** AC-1 ✓
270
+
271
+ ### Step 2: {action description}
272
+ - **Action:** Fill email "test@example.com", password "SecureP@ss123!"
273
+ - **Screenshot:** [step-2.png]
274
+ - **Page state:** Fields filled, button still enabled
275
+ - **Matches AC:** (intermediate step, no AC)
276
+
277
+ [... repeat per step ...]
278
+
279
+ ## AC Coverage Matrix
280
+
281
+ | AC | Description | Tested | Verdict | Evidence |
282
+ |----|-------------|--------|---------|----------|
283
+ | AC-1 | Login form visible | Yes | MATCH | step-1.png |
284
+ | AC-2 | Redirect to dashboard | Yes | MATCH | step-3.png |
285
+ | AC-3 | Error on wrong password | Yes | MATCH | step-5.png |
286
+ | AC-4 | Remember me session | No | NOT TESTED | — |
287
+
288
+ ## Summary
289
+
290
+ | Metric | Value |
291
+ |--------|-------|
292
+ | ACs from ticket | {N} |
293
+ | ACs matched | {N} |
294
+ | ACs mismatched | {N} |
295
+ | ACs not tested | {N} |
296
+ | Screenshots captured | {N} |
297
+ | Overall | PASS / PARTIAL / FAIL |
298
+ ```
299
+
300
+ Write to `.qa-output/UAT_VERIFICATION.md`.
301
+
302
+ If any AC is MISMATCH or NOT TESTED, present to user with recommendation:
303
+ - MISMATCH → "AC-4 says X but the app does Y — either the app has a bug or the test needs updating"
304
+ - NOT TESTED → "AC-4 is not covered by any test step — consider adding a test case"
305
+
306
+ ---
307
+
60
308
  ### VALIDATE MODE (`--validate-only`)
61
309
 
62
310
  1. Read `CLAUDE.md` — quality gates, locator tiers, assertion rules.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "qaa-agent",
3
- "version": "1.7.0",
3
+ "version": "1.7.3",
4
4
  "description": "QA Automation Agent for Claude Code — multi-agent pipeline that analyzes repos, generates tests, validates, and creates PRs",
5
5
  "bin": {
6
6
  "qaa-agent": "./bin/install.cjs"