qaa-agent 1.7.1 → 1.7.3
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +380 -0
- package/bin/install.cjs +12 -10
- package/package.json +1 -1
package/README.md
ADDED
@@ -0,0 +1,380 @@
# QAA - QA Automation Agent

[npm](https://www.npmjs.com/package/qaa-agent)
[MIT License](https://opensource.org/licenses/MIT)

Multi-agent QA pipeline for [Claude Code](https://docs.anthropic.com/en/docs/claude-code). Analyzes any codebase, generates a complete test suite following industry standards, validates everything, and delivers the result as a draft pull request.

```
scan → map → analyze → plan → generate → validate → deliver
```

No manual test writing. No guessing what to cover. One command, full pipeline.

---

## The Problem

- **Starting from zero is painful** — a new project with no tests means weeks of setup
- **Coverage gaps are invisible** — without analysis, teams don't know what's missing until production breaks
- **Standards drift** — different team members write tests differently: inconsistent locators, vague assertions, mixed naming
- **QA is always behind dev** — features ship faster than tests get written

## The Solution

QAA runs a pipeline of 12 specialized AI agents, each responsible for one stage:

| Stage | What happens | Output |
|-------|-------------|--------|
| **Scan** | Detects framework, language, testable surfaces | `SCAN_MANIFEST.md` |
| **Map** | Deep-scans codebase with 4 parallel agents (testability, risk, patterns, existing tests) | 8 codebase documents |
| **Analyze** | Produces risk assessment, test inventory, testing pyramid | `QA_ANALYSIS.md`, `TEST_INVENTORY.md` |
| **Plan** | Groups test cases by feature, assigns to files, resolves dependencies | `GENERATION_PLAN.md` |
| **Generate** | Writes test files, POMs, fixtures, configs following project standards | Test suite on disk |
| **Validate** | 4-layer validation (syntax, structure, dependencies, logic) with auto-fix | `VALIDATION_REPORT.md` |
| **Deliver** | Creates branch, commits per stage, opens draft PR | Pull request URL |

---

## Install

```bash
npx qaa-agent
```

The interactive installer:

1. Copies agents, commands, skills, templates, and workflows into your runtime directory
2. Configures the [Playwright MCP](https://github.com/anthropics/mcp-playwright) server in your user-scope config (`~/.claude.json`) so it's available in **all projects**
3. Merges required permissions into `settings.json`

**Supported runtimes:** Claude Code, OpenCode

**Install scope:** Global (`~/.claude/`, available in all projects) or Local (`./.claude/`, this project only)

### Requirements

- [Node.js](https://nodejs.org/) 18+
- [Claude Code](https://docs.anthropic.com/en/docs/claude-code) installed

### Playwright MCP (required for E2E)

QAA uses [`@playwright/mcp`](https://www.npmjs.com/package/@playwright/mcp) to open a real browser, extract locators from live pages, run E2E tests, and auto-fix locator mismatches.

**You need to install the Playwright MCP server manually in your environment:**

<details>
<summary><strong>VS Code (Claude Code extension)</strong></summary>

1. Open VS Code Settings (`Ctrl+Shift+P` > `Preferences: Open User Settings (JSON)`)
2. Add the MCP server config:

```json
{
  "claude-code.mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
```

Or add it to your project's `.vscode/mcp.json`:

```json
{
  "servers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
```

</details>

<details>
<summary><strong>Claude Code CLI</strong></summary>

Add to `~/.claude.json` (user-scope, all projects):

```json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
```

Or add a `.mcp.json` file in your project root for project-scope only.

</details>

Once configured, Playwright MCP enables QAA to:

- Open a real browser and navigate your running app
- Extract actual locators (`data-testid`, ARIA roles, labels) from live pages
- Run E2E tests, capture failures, and auto-fix locator mismatches
- Build a persistent **Locator Registry** (`.qa-output/locators/`) that caches real locators across features
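
The registry idea is simple: cache each locator verified against a live page under a feature/element key, so later runs reuse proven selectors instead of re-guessing. A minimal sketch; the key scheme and entry shape here are illustrative, not QAA's actual `.qa-output/locators/` format:

```typescript
// Hypothetical sketch of a locator registry: a per-feature cache of
// locators extracted from live pages. Illustrative only; the real
// on-disk format used by QAA may differ.
type LocatorEntry = {
  selector: string;        // e.g. '[data-testid="submit"]'
  tier: 1 | 2 | 3 | 4;     // locator quality tier (1 = best)
  verifiedAt: string;      // ISO date of last live-page verification
};

class LocatorRegistry {
  private entries = new Map<string, LocatorEntry>();

  // Key locators by "feature/element" so entries stay scoped per feature.
  set(feature: string, element: string, entry: LocatorEntry): void {
    this.entries.set(`${feature}/${element}`, entry);
  }

  get(feature: string, element: string): LocatorEntry | undefined {
    return this.entries.get(`${feature}/${element}`);
  }
}

const registry = new LocatorRegistry();
registry.set("checkout", "submit", {
  selector: '[data-testid="checkout-submit"]',
  tier: 1,
  verifiedAt: "2026-03-18",
});
```

A cache hit means the E2E runner can skip locator extraction for that element entirely.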

---

## Quick Start

### New project, no tests

```
/qa-start --dev-repo ./myproject --auto
```

Runs the full pipeline end-to-end: scan, map, analyze, plan, generate, validate, and deliver as a draft PR.

### Mature project, new feature

```
/qa-map                                           # build the "brain" (once)
/qa-create-test "password reset"                  # generate tests using codebase knowledge
/qa-pr --ticket PROJ-123 "password reset tests"   # ship as draft PR
```

### From a Jira ticket

```
/qa-from-ticket https://company.atlassian.net/browse/PROJ-456
/qa-pr --ticket PROJ-456 "login flow tests"
```

### Fix broken tests after a deploy

```
/qa-fix ./tests/e2e/checkout*
/qa-pr --ticket PROJ-789 "fix checkout tests"
```

---

## Commands

| Command | Purpose |
|---------|---------|
| `/qa-start` | Full pipeline end-to-end (scan through PR) |
| `/qa-map` | Deep codebase analysis with 4 parallel agents |
| `/qa-create-test <feature>` | Generate tests for a specific feature |
| `/qa-fix [path]` | Diagnose and fix broken tests |
| `/qa-audit [path]` | 6-dimension quality audit with scoring |
| `/qa-pr` | Create a draft pull request from QA artifacts |
| `/qa-testid [path]` | Inject `data-testid` attributes into components |

### Additional Commands

| Command | Purpose |
|---------|---------|
| `/qa-from-ticket <url>` | Generate tests from a Jira/Linear/GitHub Issue |
| `/qa-analyze` | Analyze a repo without generating tests |
| `/qa-validate [path]` | Validate test files against standards |
| `/qa-gap` | Find coverage gaps between dev and QA repos |
| `/qa-report` | Generate a QA status report |
| `/qa-audit` | Full quality audit with weighted scoring |
| `/qa-blueprint` | Generate QA repo structure from scratch |
| `/qa-research` | Research best testing stack for a project |
| `/qa-pom` | Generate Page Object Models |
| `/update-test` | Improve existing tests incrementally |

See [COMMANDS.md](docs/COMMANDS.md) for full usage details and flags.

---

## Three Workflows

QAA adapts to the project's QA maturity:

**Option 1: No QA repo yet** — Full pipeline from scratch. Produces a complete test suite, repo blueprint, and draft PR.

```
/qa-start --dev-repo ./myproject
```

**Option 2: Immature QA repo** — Scans both repos, fixes broken tests, fills coverage gaps, standardizes existing tests.

```
/qa-start --dev-repo ./myproject --qa-repo ./tests
```

**Option 3: Mature QA repo** — Surgical additions only. Finds thin coverage areas and adds targeted tests without touching working code.

```
/qa-start --dev-repo ./myproject --qa-repo ./tests
```

---

## The "Brain" — Codebase Map

Before generating anything, QAA maps the codebase with 4 parallel agents producing 8 documents:

| Focus | Documents |
|-------|-----------|
| **Testability** | `TESTABILITY.md`, `TEST_SURFACE.md` — what's testable, entry points, mock boundaries |
| **Risk** | `RISK_MAP.md`, `CRITICAL_PATHS.md` — business-critical paths, security-sensitive areas |
| **Patterns** | `CODE_PATTERNS.md`, `API_CONTRACTS.md` — naming conventions, API shapes, import style |
| **Existing tests** | `TEST_ASSESSMENT.md`, `COVERAGE_GAPS.md` — current quality, frameworks, gaps |

Every downstream agent reads these documents. The result: generated tests feel native to the codebase, not generic boilerplate.

---

## Standards Enforced

Every generated artifact follows strict rules defined in [CLAUDE.md](CLAUDE.md):

### Testing Pyramid

```
      /  E2E  \        3-5%   (critical path smoke only)
     /   API   \      20-25%  (endpoints + contracts)
    /Integration\     10-15%  (component interactions)
   /    Unit     \    60-70%  (business logic, pure functions)
```

### Locator Hierarchy

1. **Tier 1 (Best):** `data-testid`, ARIA roles with accessible names
2. **Tier 2 (Good):** Form labels, placeholders, visible text
3. **Tier 3 (Acceptable):** Alt text, title attributes
4. **Tier 4 (Last Resort):** CSS selectors, XPath — always with a `// TODO` comment
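
The hierarchy boils down to a ranking function over locator kinds. A sketch of that idea; the type and names below are illustrative, not QAA internals:

```typescript
// Illustrative ranking of a candidate locator against the tier
// hierarchy above. Not part of QAA itself.
type Locator =
  | { kind: "testid" | "role" }                 // Tier 1
  | { kind: "label" | "placeholder" | "text" }  // Tier 2
  | { kind: "alt" | "title" }                   // Tier 3
  | { kind: "css" | "xpath" };                  // Tier 4 (flag with a TODO)

function tierOf(locator: Locator): 1 | 2 | 3 | 4 {
  switch (locator.kind) {
    case "testid":
    case "role":
      return 1;
    case "label":
    case "placeholder":
    case "text":
      return 2;
    case "alt":
    case "title":
      return 3;
    default:
      return 4; // css / xpath: last resort
  }
}
```

A validator can then reject any generated locator whose tier is 4 unless it carries the required `// TODO` comment.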

### Page Object Model

- One class per page, no god objects
- No assertions in POMs — assertions belong in test specs
- Locators as readonly properties
- Every POM extends a shared `BasePage`
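
Put together, a POM under these rules looks roughly like the sketch below. The `FakePage`/`FakeLocator` interfaces are simplified stand-ins for Playwright's types, and `BasePage` here is a minimal placeholder for whatever shared base the suite defines; this is an illustration, not QAA's generated output.

```typescript
// Minimal POM sketch following the rules above: one class per page,
// locators as readonly properties, actions only, no assertions.
// FakePage/FakeLocator are simplified stand-ins, not real Playwright types.
interface FakeLocator { fill(value: string): void; click(): void; }
interface FakePage { locator(selector: string): FakeLocator; }

class BasePage {
  constructor(protected readonly page: FakePage) {}
}

class LoginPage extends BasePage {
  // Locators as readonly properties, Tier 1 (data-testid) first.
  readonly email = this.page.locator('[data-testid="login-email"]');
  readonly password = this.page.locator('[data-testid="login-password"]');
  readonly submit = this.page.locator('[data-testid="login-submit"]');

  // Actions only: the assertions stay in the test spec.
  login(email: string, password: string): void {
    this.email.fill(email);
    this.password.fill(password);
    this.submit.click();
  }
}
```

The test spec then calls `login(...)` and asserts on the resulting page state, keeping the POM free of expectations.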

### Assertion Quality

```typescript
// Good — concrete values
expect(response.status).toBe(200);
expect(data.name).toBe('Test User');

// Bad — never do this
expect(response.status).toBeTruthy();
expect(data).toBeDefined();
```

### Test Case IDs

Every test case has a unique ID following the pattern:

- `UT-MODULE-001` — unit tests
- `INT-MODULE-001` — integration tests
- `API-RESOURCE-001` — API tests
- `E2E-FLOW-001` — E2E tests
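
The convention is a layer prefix, an UPPERCASE module/resource/flow token, and a zero-padded three-digit sequence, which a single regex can enforce. The helper below is an illustration of that pattern, not part of QAA:

```typescript
// Illustrative check that a test case ID matches the documented pattern:
// layer prefix, UPPERCASE token, zero-padded 3-digit sequence.
const TEST_ID = /^(UT|INT|API|E2E)-[A-Z0-9]+-\d{3}$/;

function isValidTestId(id: string): boolean {
  return TEST_ID.test(id);
}
```

For example, `UT-AUTH-001` and `E2E-CHECKOUT-012` pass, while `e2e-login-1` does not.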

---

## Validation

Generated tests pass through a 4-layer validation with auto-fix (up to 3 loops):

1. **Syntax** — does it parse? Are imports correct?
2. **Structure** — POM rules, file organization, naming conventions
3. **Dependencies** — all imports resolve, mocks set up correctly
4. **Logic** — assertions are concrete, locators follow the tier hierarchy

If issues remain, the **Bug Detective** classifies each failure:

| Classification | Action |
|----------------|--------|
| `APPLICATION BUG` | Flagged for developer — not auto-fixed |
| `TEST CODE ERROR` | Auto-fixed at HIGH confidence |
| `ENVIRONMENT ISSUE` | Documented with setup instructions |
| `INCONCLUSIVE` | Flagged with evidence for manual review |
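
The routing in this table amounts to a four-way dispatch on the classification label. A sketch using the labels above; the action names are illustrative, not QAA's:

```typescript
// Sketch of the Bug Detective routing table above. Classification
// labels come from the README; action names are illustrative.
type Classification =
  | "APPLICATION BUG"
  | "TEST CODE ERROR"
  | "ENVIRONMENT ISSUE"
  | "INCONCLUSIVE";

function actionFor(c: Classification): string {
  switch (c) {
    case "APPLICATION BUG":
      return "flag-for-developer"; // never auto-fixed
    case "TEST CODE ERROR":
      return "auto-fix";           // only at HIGH confidence
    case "ENVIRONMENT ISSUE":
      return "document-setup";
    case "INCONCLUSIVE":
      return "manual-review";      // flagged with evidence
  }
}
```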

---

## Framework Support

QAA auto-detects the project's existing stack and matches it:

**Languages:** JavaScript/TypeScript, Python, Java, .NET/C#, Go, Ruby, PHP, Rust

**Test Frameworks:** Playwright, Cypress, Jest, Vitest, pytest, Selenium, and more

**Build Tools:** Vite, Next.js, Nuxt, Angular, Vue, Webpack, SvelteKit

**Git Platforms:** GitHub, Azure DevOps, GitLab

---

## Learning System

QAA remembers your preferences across sessions. When you correct it — "use Playwright, not Cypress" or "our branches start with `feature/`" — it saves the rule permanently to `MY_PREFERENCES.md`. Every agent reads your preferences before generating output.

Your team's conventions always win over defaults.

---

## Architecture

```
qaa-agent/
  agents/        # 12 specialized QA agents
  commands/      # 7 slash commands (user-facing entry points)
  skills/        # 6 reusable skills
  templates/     # 10 artifact templates (output format contracts)
  workflows/     # 7 workflow orchestration specs
  bin/           # Installer and CLI tools
  docs/          # User documentation
  CLAUDE.md      # QA standards (read by every agent)
  .mcp.json      # Playwright MCP server config
  settings.json  # Claude Code permissions
```

### Agents

| Agent | Responsibility |
|-------|---------------|
| `qa-scanner` | Framework detection, file tree scanning |
| `qa-codebase-mapper` | 4-parallel-agent deep analysis |
| `qa-analyzer` | Risk assessment, test inventory, pyramid |
| `qa-planner` | Test case grouping, file assignment |
| `qa-executor` | Test file, POM, fixture generation |
| `qa-validator` | 4-layer validation with auto-fix |
| `qa-e2e-runner` | Browser-based test execution via Playwright MCP |
| `qa-bug-detective` | Failure classification with evidence |
| `qa-testid-injector` | `data-testid` attribute injection |
| `qa-project-researcher` | Testing stack research |
| `qa-discovery` | Project discovery |
| `qa-pipeline-orchestrator` | Pipeline coordination |

---

## Git Workflow

QAA follows strict git conventions:

- **Branch:** `qa/auto-{project}-{date}` (e.g., `qa/auto-shopflow-2026-03-18`)
- **Commits:** One per agent stage — `qa(scanner): produce SCAN_MANIFEST.md for shopflow`
- **PR:** Draft PR with analysis summary, test counts, coverage metrics, validation status
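
The branch convention above can be sketched as a tiny helper (illustrative, not QAA's code); the date slot is the ISO `YYYY-MM-DD` form:

```typescript
// Illustrative builder for the qa/auto-{project}-{date} branch
// convention described above. Not QAA's actual implementation.
function qaBranchName(project: string, date: Date): string {
  const day = date.toISOString().slice(0, 10); // YYYY-MM-DD
  return `qa/auto-${project}-${day}`;
}
```

For example, `qaBranchName("shopflow", new Date("2026-03-18T12:00:00Z"))` yields the branch name shown in the bullet above.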

---

## Documentation

- [Commands Reference](docs/COMMANDS.md) — all commands with flags and examples
- [Demo & Walkthrough](docs/DEMO.md) — problem/solution explanation with real workflows
- [Changelog](CHANGELOG.md) — version history

---

## License

[MIT](LICENSE)

---

Built by [Capmation](https://github.com/capmation)
package/bin/install.cjs
CHANGED
@@ -143,22 +143,24 @@ async function main() {
   copyFile(path.join(ROOT, 'CLAUDE.md'), path.join(qaaDir, 'CLAUDE.md'));
   ok('Installed QA standards (CLAUDE.md)');

-  // Install .mcp.json (Playwright MCP server config)
+  // Install .mcp.json (Playwright MCP server config)
   const mcpSrc = path.join(ROOT, '.mcp.json');
   if (fs.existsSync(mcpSrc)) {
     // Copy to qaa dir for reference
     copyFile(mcpSrc, path.join(qaaDir, '.mcp.json'));
-    … (old lines 151–156 truncated in this diff view)
+
+    // Merge MCP servers into ~/.claude.json (user-scope) so they're available in ALL projects
+    // Note: ~/.claude/.mcp.json is project-scope for ~/.claude/ only — NOT global
+    const userConfigPath = path.join(HOME, '.claude.json');
+    let userConfig = {};
+    if (fs.existsSync(userConfigPath)) {
+      try { userConfig = JSON.parse(fs.readFileSync(userConfigPath, 'utf8')); } catch {}
     }
+    userConfig.mcpServers = userConfig.mcpServers || {};
     const qaaMcp = JSON.parse(fs.readFileSync(mcpSrc, 'utf8'));
-    Object.assign(
-    fs.writeFileSync(
-    ok('Installed Playwright MCP server config (
+    Object.assign(userConfig.mcpServers, qaaMcp.mcpServers);
+    fs.writeFileSync(userConfigPath, JSON.stringify(userConfig, null, 2));
+    ok('Installed Playwright MCP server config (user-scope — available in all projects)');
   }

   // Write version
package/package.json
CHANGED