npm - qaa-agent - Versions diffs - 1.6.2 → 1.7.0 - Mend

qaa-agent 1.6.2 → 1.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (78)

package/.mcp.json +8 -8
package/CHANGELOG.md +93 -71
package/CLAUDE.md +553 -553
package/agents/qa-pipeline-orchestrator.md +1378 -1378
package/agents/qaa-analyzer.md +539 -524
package/agents/qaa-bug-detective.md +479 -446
package/agents/qaa-codebase-mapper.md +935 -935
package/agents/qaa-discovery.md +384 -0
package/agents/qaa-e2e-runner.md +416 -415
package/agents/qaa-executor.md +651 -651
package/agents/qaa-planner.md +405 -390
package/agents/qaa-project-researcher.md +319 -319
package/agents/qaa-scanner.md +424 -424
package/agents/qaa-testid-injector.md +643 -585
package/agents/qaa-validator.md +490 -452
package/bin/install.cjs +200 -198
package/bin/lib/commands.cjs +709 -709
package/bin/lib/config.cjs +307 -307
package/bin/lib/core.cjs +497 -497
package/bin/lib/frontmatter.cjs +299 -299
package/bin/lib/init.cjs +989 -989
package/bin/lib/milestone.cjs +241 -241
package/bin/lib/model-profiles.cjs +60 -60
package/bin/lib/phase.cjs +911 -911
package/bin/lib/roadmap.cjs +306 -306
package/bin/lib/state.cjs +748 -748
package/bin/lib/template.cjs +222 -222
package/bin/lib/verify.cjs +842 -842
package/bin/qaa-tools.cjs +607 -607
package/commands/qa-audit.md +119 -0
package/commands/qa-create-test.md +288 -0
package/commands/qa-fix.md +147 -0
package/commands/qa-map.md +137 -0
package/{.claude/commands → commands}/qa-pr.md +23 -23
package/{.claude/commands → commands}/qa-start.md +22 -22
package/{.claude/commands → commands}/qa-testid.md +19 -19
package/docs/COMMANDS.md +341 -341
package/docs/DEMO.md +182 -182
package/docs/TESTING.md +156 -156
package/package.json +6 -7
package/{.claude/settings.json → settings.json} +1 -2
package/templates/failure-classification.md +391 -391
package/templates/gap-analysis.md +409 -409
package/templates/pr-template.md +48 -48
package/templates/qa-analysis.md +381 -381
package/templates/qa-audit-report.md +465 -465
package/templates/qa-repo-blueprint.md +636 -636
package/templates/scan-manifest.md +312 -312
package/templates/test-inventory.md +582 -582
package/templates/testid-audit-report.md +354 -354
package/templates/validation-report.md +243 -243
package/workflows/qa-analyze.md +296 -296
package/workflows/qa-from-ticket.md +536 -536
package/workflows/qa-gap.md +309 -303
package/workflows/qa-pr.md +389 -389
package/workflows/qa-start.md +1192 -1168
package/workflows/qa-testid.md +384 -356
package/workflows/qa-validate.md +299 -295
package/.claude/commands/create-test.md +0 -164
package/.claude/commands/qa-audit.md +0 -37
package/.claude/commands/qa-blueprint.md +0 -54
package/.claude/commands/qa-fix.md +0 -36
package/.claude/commands/qa-from-ticket.md +0 -24
package/.claude/commands/qa-gap.md +0 -20
package/.claude/commands/qa-map.md +0 -47
package/.claude/commands/qa-pom.md +0 -36
package/.claude/commands/qa-pyramid.md +0 -37
package/.claude/commands/qa-report.md +0 -38
package/.claude/commands/qa-research.md +0 -33
package/.claude/commands/qa-validate.md +0 -42
package/.claude/commands/update-test.md +0 -58
package/.claude/skills/qa-learner/SKILL.md +0 -150
package/{.claude/skills → skills}/qa-bug-detective/SKILL.md +0 -0
package/{.claude/skills → skills}/qa-repo-analyzer/SKILL.md +0 -0
package/{.claude/skills → skills}/qa-self-validator/SKILL.md +0 -0
package/{.claude/skills → skills}/qa-template-engine/SKILL.md +0 -0
package/{.claude/skills → skills}/qa-testid-injector/SKILL.md +0 -0
package/{.claude/skills → skills}/qa-workflow-documenter/SKILL.md +0 -0

package/docs/DEMO.md CHANGED

@@ -1,182 +1,182 @@
-# QAA — QA Automation Agent
-## What is it?
-QAA is a multi-agent system that automates QA test creation for any software project. You point it at a codebase, and it analyzes the architecture, maps the code, generates a full test suite following industry standards, validates everything, and delivers the result as a draft pull request — ready for review.
-No manual test writing. No guessing what to cover. One command, full pipeline.
-## The Problem
-Writing test suites is slow, repetitive, and often inconsistent. Teams face:
-- **Starting from zero is painful** — a new project with no tests means weeks of setup before the first real test runs
-- **Coverage gaps are invisible** — without analysis, teams don't know what's missing until something breaks in production
-- **Standards drift** — different team members write tests differently: inconsistent locators, vague assertions, mixed naming conventions
-- **QA is always behind dev** — features ship faster than tests get written, and the gap keeps growing
-- **Existing QA teams still spend hours on repetitive work** — even with a mature test suite, adding tests for new features means manually inspecting pages, finding locators, writing POMs, running tests, fixing failures, repeat
-## The Solution
-QAA runs a pipeline of specialized AI agents, each responsible for one stage:
-```
-scan → map → analyze → plan → generate → validate → deliver
-```
-| Stage | What happens | Output |
-|-------|-------------|--------|
-| **Scan** | Detects framework, language, testable surfaces | SCAN_MANIFEST.md |
-| **Map** | Deep-scans codebase for testability, risk, patterns, existing tests (4 parallel agents) | 8 codebase documents |
-| **Analyze** | Produces risk assessment, test inventory, testing pyramid | QA_ANALYSIS.md, TEST_INVENTORY.md |
-| **Plan** | Groups test cases by feature, assigns to files, resolves dependencies | GENERATION_PLAN.md |
-| **Generate** | Writes test files, POMs, fixtures, configs following project standards | Test suite on disk |
-| **Validate** | 4-layer validation (syntax, structure, dependencies, logic) with auto-fix | VALIDATION_REPORT.md |
-| **Deliver** | Creates branch, commits per stage, pushes, opens draft PR | Pull request URL |
-Every agent reads the project's QA standards (CLAUDE.md) before producing output. Every test case has a unique ID, concrete inputs, and explicit expected outcomes — never "works correctly."
-## Three Workflows
-QAA adapts to where the project is in its QA maturity:
-**1. No QA repo yet** — `/qa-start --dev-repo ./myproject`
-Full pipeline from scratch. Produces a complete test suite, QA repo blueprint, and a draft PR with everything.
-**2. Immature QA repo** — `/qa-start --dev-repo ./myproject --qa-repo ./tests`
-Scans both repos, identifies gaps, fixes broken tests, adds missing coverage, standardizes existing tests.
-**3. Mature QA repo** — `/qa-start --dev-repo ./myproject --qa-repo ./tests`
-Only adds surgical test additions where coverage is thin. Doesn't touch working tests.
-## The "Brain" — Codebase Map
-Before generating anything, QAA maps the entire codebase with 4 parallel agents:
-- **Testability** — what's testable, pure functions vs stateful code, mock boundaries
-- **Risk** — business-critical paths, security-sensitive areas, data integrity risks
-- **Patterns** — naming conventions, API shapes, import style, code patterns
-- **Existing tests** — current test quality, frameworks in use, coverage gaps
-These 8 documents become the shared context that every downstream agent reads. The analyzer uses risk data to prioritize tests. The planner uses testability data to estimate complexity. The executor uses code patterns to generate tests that match the project's style.
-Result: generated tests feel native to the codebase, not generic boilerplate.
-## Day-to-Day for a QA Engineer
-This is where QAA shines for teams that already have a mature QA repo. The full pipeline is for bootstrapping — but the real daily value is the targeted workflow.
-### The scenario
-You're a QA engineer. A developer just shipped a new "password reset" feature. You need tests. Here's what happens:
-### Step 1: Map the codebase (once)
-```
-/qa-map
-```
-QAA scans the entire project and builds its "brain" — 8 documents covering testability, risk areas, API contracts, code patterns, and existing test coverage. This runs once and stays valid until the codebase changes significantly.
-### Step 2: Create tests for the feature
-```
-/create-test "password reset"
-```
-QAA already knows the codebase. It reads the brain documents, finds the relevant source files (`auth.service.ts`, `reset.controller.ts`, the reset page component), understands the API contracts, and generates:
-- Unit tests for the reset token logic with concrete inputs and expected outputs
-- API tests for `POST /api/auth/reset-password` with real request/response shapes
-- E2E tests with Page Object Models that use the project's existing POM base class
-- Fixtures with test data (fake emails, expired tokens, invalid tokens)
-All following the project's naming conventions, import style, and assertion patterns.
-### Step 3: Validate and fix in a loop
-```
-/qa-validate ./tests
-```
-The validator runs 4 layers of checks on every generated file:
-1. **Syntax** — does it parse? Are imports correct?
-2. **Structure** — does it follow POM rules? Are locators in the right tier?
-3. **Dependencies** — do all imports resolve? Are mocks set up correctly?
-4. **Logic** — are assertions concrete? Do test IDs follow the convention?
-If issues are found, the validator auto-fixes them and re-checks — up to 3 loops. If something still fails, the bug detective classifies it: is it an application bug, a test code error, or an environment issue?
-### Step 4: Run the tests with Playwright
-QAA integrates with Playwright to actually execute the generated E2E tests against a running application. It opens the browser, navigates pages, fills forms, clicks buttons, and captures what happens. If a test fails, it reads the error, inspects the page state, and determines whether the locator is wrong, the page changed, or there's a real bug.
-The loop looks like this:
-```
-generate → validate → run → failures? → classify → fix test code → run again → pass
-```
-This continues until the tests pass or the issue is classified as an application bug that needs a developer fix.
-### Step 5: Ship it
-```
-/qa-pr --ticket PROJ-456 "password reset tests"
-```
-QAA creates a branch following your team's naming convention (it asked you once and remembers forever), commits the test files, pushes, and opens a draft PR on GitHub, Azure DevOps, or GitLab — whatever your team uses. You get the link.
-### The full daily flow
-```
-/qa-map                                          → builds the "brain" (once)
-/create-test "password reset"                    → generates tests using codebase knowledge
-/qa-validate ./tests/unit/auth*                  → validates + auto-fixes
-/qa-pr --ticket PROJ-456 "password reset tests"  → draft PR with link
-```
-From ticket to PR in minutes, not hours. And the tests follow the same standards as every other test in the repo because QAA read the existing patterns first.
-### What about tickets?
-If you work from Jira, Linear, or GitHub Issues, skip the manual description:
-```
-/qa-from-ticket https://company.atlassian.net/browse/PROJ-456
-```
-QAA fetches the ticket, extracts acceptance criteria and edge cases, maps each criterion to test cases with a traceability matrix, generates the tests, validates them, and gives you a report showing which AC is covered by which test.
-### When tests break after a deploy
-```
-/qa-fix ./tests/e2e/checkout*
-```
-QAA reads the failing tests, runs them, classifies each failure (app bug vs test code error vs environment issue), and auto-fixes the test code errors. Application bugs get flagged for the dev team with evidence — the exact assertion that failed, what was expected, and what was received.
-## Standards
-Every test artifact follows strict rules:
-- **Testing pyramid** — 60-70% unit, 10-15% integration, 20-25% API, 3-5% E2E
-- **Locator hierarchy** — data-testid first, ARIA roles, labels, CSS as last resort (with TODO)
-- **Page Object Model** — one class per page, no assertions in POMs, locators as properties
-- **Assertions** — concrete values only. `expect(status).toBe(200)` not `expect(status).toBeTruthy()`
-- **Naming** — unique IDs per test case: `UT-AUTH-001`, `API-USERS-003`, `E2E-CHECKOUT-001`
-## Learning System
-QAA remembers your preferences across sessions. When you correct it — "use Playwright, not Cypress" or "our branches start with feature/" — it saves the rule permanently. Next time, every agent reads your preferences before generating output.
-Preferences override defaults. Your team's conventions always win.
-## Numbers
-17 commands. 7 skills. 11 agents. 10 templates. 7 workflows.
-Supports GitHub, Azure DevOps, and GitLab. Works with Playwright, Cypress, Jest, Vitest, pytest, and more — detects what the project uses and matches it.
-One goal: you focus on building features, QAA handles the tests.
+# QAA — QA Automation Agent
+## What is it?
+QAA is a multi-agent system that automates QA test creation for any software project. You point it at a codebase, and it analyzes the architecture, maps the code, generates a full test suite following industry standards, validates everything, and delivers the result as a draft pull request — ready for review.
+No manual test writing. No guessing what to cover. One command, full pipeline.
+## The Problem
+Writing test suites is slow, repetitive, and often inconsistent. Teams face:
+- **Starting from zero is painful** — a new project with no tests means weeks of setup before the first real test runs
+- **Coverage gaps are invisible** — without analysis, teams don't know what's missing until something breaks in production
+- **Standards drift** — different team members write tests differently: inconsistent locators, vague assertions, mixed naming conventions
+- **QA is always behind dev** — features ship faster than tests get written, and the gap keeps growing
+- **Existing QA teams still spend hours on repetitive work** — even with a mature test suite, adding tests for new features means manually inspecting pages, finding locators, writing POMs, running tests, fixing failures, repeat
+## The Solution
+QAA runs a pipeline of specialized AI agents, each responsible for one stage:
+```
+scan → map → analyze → plan → generate → validate → deliver
+```
+| Stage | What happens | Output |
+|-------|-------------|--------|
+| **Scan** | Detects framework, language, testable surfaces | SCAN_MANIFEST.md |
+| **Map** | Deep-scans codebase for testability, risk, patterns, existing tests (4 parallel agents) | 8 codebase documents |
+| **Analyze** | Produces risk assessment, test inventory, testing pyramid | QA_ANALYSIS.md, TEST_INVENTORY.md |
+| **Plan** | Groups test cases by feature, assigns to files, resolves dependencies | GENERATION_PLAN.md |
+| **Generate** | Writes test files, POMs, fixtures, configs following project standards | Test suite on disk |
+| **Validate** | 4-layer validation (syntax, structure, dependencies, logic) with auto-fix | VALIDATION_REPORT.md |
+| **Deliver** | Creates branch, commits per stage, pushes, opens draft PR | Pull request URL |
+Every agent reads the project's QA standards (CLAUDE.md) before producing output. Every test case has a unique ID, concrete inputs, and explicit expected outcomes — never "works correctly."
+## Three Workflows
+QAA adapts to where the project is in its QA maturity:
+**1. No QA repo yet** — `/qa-start --dev-repo ./myproject`
+Full pipeline from scratch. Produces a complete test suite, QA repo blueprint, and a draft PR with everything.
+**2. Immature QA repo** — `/qa-start --dev-repo ./myproject --qa-repo ./tests`
+Scans both repos, identifies gaps, fixes broken tests, adds missing coverage, standardizes existing tests.
+**3. Mature QA repo** — `/qa-start --dev-repo ./myproject --qa-repo ./tests`
+Only adds surgical test additions where coverage is thin. Doesn't touch working tests.
+## The "Brain" — Codebase Map
+Before generating anything, QAA maps the entire codebase with 4 parallel agents:
+- **Testability** — what's testable, pure functions vs stateful code, mock boundaries
+- **Risk** — business-critical paths, security-sensitive areas, data integrity risks
+- **Patterns** — naming conventions, API shapes, import style, code patterns
+- **Existing tests** — current test quality, frameworks in use, coverage gaps
+These 8 documents become the shared context that every downstream agent reads. The analyzer uses risk data to prioritize tests. The planner uses testability data to estimate complexity. The executor uses code patterns to generate tests that match the project's style.
+Result: generated tests feel native to the codebase, not generic boilerplate.
+## Day-to-Day for a QA Engineer
+This is where QAA shines for teams that already have a mature QA repo. The full pipeline is for bootstrapping — but the real daily value is the targeted workflow.
+### The scenario
+You're a QA engineer. A developer just shipped a new "password reset" feature. You need tests. Here's what happens:
+### Step 1: Map the codebase (once)
+```
+/qa-map
+```
+QAA scans the entire project and builds its "brain" — 8 documents covering testability, risk areas, API contracts, code patterns, and existing test coverage. This runs once and stays valid until the codebase changes significantly.
+### Step 2: Create tests for the feature
+```
+/create-test "password reset"
+```
+QAA already knows the codebase. It reads the brain documents, finds the relevant source files (`auth.service.ts`, `reset.controller.ts`, the reset page component), understands the API contracts, and generates:
+- Unit tests for the reset token logic with concrete inputs and expected outputs
+- API tests for `POST /api/auth/reset-password` with real request/response shapes
+- E2E tests with Page Object Models that use the project's existing POM base class
+- Fixtures with test data (fake emails, expired tokens, invalid tokens)
+All following the project's naming conventions, import style, and assertion patterns.
+### Step 3: Validate and fix in a loop
+```
+/qa-validate ./tests
+```
+The validator runs 4 layers of checks on every generated file:
+1. **Syntax** — does it parse? Are imports correct?
+2. **Structure** — does it follow POM rules? Are locators in the right tier?
+3. **Dependencies** — do all imports resolve? Are mocks set up correctly?
+4. **Logic** — are assertions concrete? Do test IDs follow the convention?
+If issues are found, the validator auto-fixes them and re-checks — up to 3 loops. If something still fails, the bug detective classifies it: is it an application bug, a test code error, or an environment issue?
+### Step 4: Run the tests with Playwright
+QAA integrates with Playwright to actually execute the generated E2E tests against a running application. It opens the browser, navigates pages, fills forms, clicks buttons, and captures what happens. If a test fails, it reads the error, inspects the page state, and determines whether the locator is wrong, the page changed, or there's a real bug.
+The loop looks like this:
+```
+generate → validate → run → failures? → classify → fix test code → run again → pass
+```
+This continues until the tests pass or the issue is classified as an application bug that needs a developer fix.
+### Step 5: Ship it
+```
+/qa-pr --ticket PROJ-456 "password reset tests"
+```
+QAA creates a branch following your team's naming convention (it asked you once and remembers forever), commits the test files, pushes, and opens a draft PR on GitHub, Azure DevOps, or GitLab — whatever your team uses. You get the link.
+### The full daily flow
+```
+/qa-map                                          → builds the "brain" (once)
+/create-test "password reset"                    → generates tests using codebase knowledge
+/qa-validate ./tests/unit/auth*                  → validates + auto-fixes
+/qa-pr --ticket PROJ-456 "password reset tests"  → draft PR with link
+```
+From ticket to PR in minutes, not hours. And the tests follow the same standards as every other test in the repo because QAA read the existing patterns first.
+### What about tickets?
+If you work from Jira, Linear, or GitHub Issues, skip the manual description:
+```
+/qa-from-ticket https://company.atlassian.net/browse/PROJ-456
+```
+QAA fetches the ticket, extracts acceptance criteria and edge cases, maps each criterion to test cases with a traceability matrix, generates the tests, validates them, and gives you a report showing which AC is covered by which test.
+### When tests break after a deploy
+```
+/qa-fix ./tests/e2e/checkout*
+```
+QAA reads the failing tests, runs them, classifies each failure (app bug vs test code error vs environment issue), and auto-fixes the test code errors. Application bugs get flagged for the dev team with evidence — the exact assertion that failed, what was expected, and what was received.
+## Standards
+Every test artifact follows strict rules:
+- **Testing pyramid** — 60-70% unit, 10-15% integration, 20-25% API, 3-5% E2E
+- **Locator hierarchy** — data-testid first, ARIA roles, labels, CSS as last resort (with TODO)
+- **Page Object Model** — one class per page, no assertions in POMs, locators as properties
+- **Assertions** — concrete values only. `expect(status).toBe(200)` not `expect(status).toBeTruthy()`
+- **Naming** — unique IDs per test case: `UT-AUTH-001`, `API-USERS-003`, `E2E-CHECKOUT-001`
+## Learning System
+QAA remembers your preferences across sessions. When you correct it — "use Playwright, not Cypress" or "our branches start with feature/" — it saves the rule permanently. Next time, every agent reads your preferences before generating output.
+Preferences override defaults. Your team's conventions always win.
+## Numbers
+17 commands. 7 skills. 11 agents. 10 templates. 7 workflows.
+Supports GitHub, Azure DevOps, and GitLab. Works with Playwright, Cypress, Jest, Vitest, pytest, and more — detects what the project uses and matches it.
+One goal: you focus on building features, QAA handles the tests.
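
The Standards list in the DEMO.md hunk above names two rules that are easy to check mechanically: the test ID convention (`UT-AUTH-001`, `API-USERS-003`, `E2E-CHECKOUT-001`) and the testing pyramid percentages. The sketch below is not taken from the package. The helper names `parseTestId` and `tiersOutOfRange`, the three-digit sequence, and the uppercase feature segment are assumptions inferred from the three example IDs. Only the prefixes shown in the text are accepted, because the document does not state a prefix for integration tests.

```typescript
// Hypothetical helpers, not part of qaa-agent. Only the example IDs and the
// percentage ranges come from the DEMO.md text; everything else is assumed.

type Prefix = "UT" | "API" | "E2E";

interface TestId {
  prefix: Prefix;
  feature: string;
  sequence: number;
}

// Assumed shape: <PREFIX>-<FEATURE>-<three digits>, e.g. UT-AUTH-001.
const TEST_ID_PATTERN = /^(UT|API|E2E)-([A-Z][A-Z0-9]*)-(\d{3})$/;

// Returns the parsed parts of a test ID, or null if it does not match.
function parseTestId(raw: string): TestId | null {
  const match = TEST_ID_PATTERN.exec(raw);
  if (match === null) return null;
  const [, prefix, feature, digits] = match;
  if (prefix === undefined || feature === undefined || digits === undefined) {
    return null;
  }
  return { prefix: prefix as Prefix, feature, sequence: Number(digits) };
}

type Tier = "unit" | "integration" | "api" | "e2e";

// Percentage ranges quoted in the Standards list (inclusive bounds assumed).
const PYRAMID_RANGES: Record<Tier, [number, number]> = {
  unit: [60, 70],
  integration: [10, 15],
  api: [20, 25],
  e2e: [3, 5],
};

// Returns the tiers whose share of the suite falls outside its range.
function tiersOutOfRange(counts: Record<Tier, number>): Tier[] {
  const total = counts.unit + counts.integration + counts.api + counts.e2e;
  if (total === 0) return [];
  const tiers: Tier[] = ["unit", "integration", "api", "e2e"];
  return tiers.filter((tier) => {
    const share = (counts[tier] / total) * 100;
    const [min, max] = PYRAMID_RANGES[tier];
    return share < min || share > max;
  });
}

// Usage: a malformed ID is rejected, and a suite that is heavy on E2E tests is flagged.
console.log(parseTestId("UT-AUTH-001"));
console.log(parseTestId("ut-auth-1"));
console.log(tiersOutOfRange({ unit: 50, integration: 10, api: 20, e2e: 20 }));
```

A check like this could run before a draft PR is opened, so that ID or distribution drift shows up as a list of offending IDs or tiers and not as a review comment.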