npm - qaa-agent - Versions diffs - 1.9.1 → 1.9.2 - Mend

qaa-agent 1.9.1 → 1.9.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11)

package/CHANGELOG.md +180 -186
package/CLAUDE.md +557 -557
package/README.md +1 -1
package/VERSION +1 -0
package/agents/qa-pipeline-orchestrator.md +1424 -1424
package/agents/qaa-bug-detective.md +654 -630
package/agents/qaa-e2e-runner.md +576 -552
package/agents/qaa-executor.md +829 -805
package/agents/qaa-project-researcher.md +400 -339
package/package.json +3 -2
package/workflows/qa-start.md +1405 -1262
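
The 1.9.2 changelog entry in `package/CHANGELOG.md` describes a version-aware `libraryId` for Context7 `query-docs` calls, in the form `/org/project/v1.40.0`, with a fallback to the base library ID when no version can be detected. A minimal sketch of that fallback logic is given here, under stated assumptions: the function name `versionedLibraryId`, its parameters, and the way a range specifier such as `^1.40.0` is reduced to `1.40.0` are all hypothetical and are not taken from the package itself.

```typescript
// Hypothetical helper, for illustration only; not part of qaa-agent.
// baseId: a base library ID such as "/org/project".
// detectedVersion: a version string read from a manifest or lock file, if any.
function versionedLibraryId(baseId: string, detectedVersion?: string): string {
  // No version detected: fall back to the base library ID.
  if (!detectedVersion) {
    return baseId;
  }
  // Assumption: take the first x.y.z sequence, dropping range prefixes like ^ or ~.
  const match = detectedVersion.match(/\d+\.\d+\.\d+/);
  if (!match) {
    // Unparseable specifier (for example "latest"): also fall back to the base ID.
    return baseId;
  }
  return `${baseId}/v${match[0]}`;
}

console.log(versionedLibraryId("/org/project", "^1.40.0")); // prints "/org/project/v1.40.0"
console.log(versionedLibraryId("/org/project"));            // prints "/org/project"
```

A lock file would normally supply an exact version, so the range-stripping step matters only when the version comes from a manifest.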

package/CHANGELOG.md CHANGED

@@ -1,186 +1,180 @@
-# Changelog
-All notable changes to QAA (QA Automation Agent) are documented here.
-## [1.9.1] - 2026-04-27
-### Added
-- **`/qa-test-report` command** — generates a per-ticket QA execution summary and appends it to the Azure DevOps work item's `Custom.QATestCasesReport` field.
-  - Resolves the work item and its linked test cases (TestedBy-Forward relations)
-  - Pulls test case execution status via REST `/_apis/test/points` (using `ADO_MCP_AUTH_TOKEN` env var as PAT, Basic auth — same pattern as `/qa-create-test --ado`), or falls back to a manual prompt when no run result exists or the token is not set
-  - Renders a markdown report in chat (for review) and an HTML report appended to the ADO field — preserving prior content with a blank-line separator (no local file is written)
-  - Smoke-tested end-to-end (manual mode, all passed) — field write and ADO render verified
-## [1.9.0] - 2026-04-24
-### Added
-- **`/qa-research` command** — new standalone command that invokes the `qaa-project-researcher` agent to investigate the testing ecosystem for the current project. Supports 4 modes via `--focus` flag: `stack-testing` (default, full stack analysis), `framework-deep-dive` (deep dive into detected framework), `api-testing` (API testing strategy), `e2e-strategy` (E2E strategy). Produces research documents in `.qa-output/research/` consumed by all downstream agents.
-- **Research stage in `/qa-start` pipeline** — the full pipeline now runs `scan → research → codebase-map → analyze → ...`. The researcher agent runs after the scanner, using Context7 MCP as the primary source for framework documentation. Research is non-blocking: if it fails, downstream agents fall back to querying Context7 directly.
-- **Context7 MCP mandatory in 3 agents** — `qaa-executor`, `qaa-e2e-runner`, and `qaa-bug-detective` now have a non-negotiable "Framework Verification via Context7" section. Before generating code, fixing locators, or auto-fixing test errors, these agents MUST query Context7 for the framework's current API and syntax. This applies to ALL frameworks including well-known ones (Playwright, Cypress, Jest) — training data may be outdated.
-- **Research documents propagated across commands** — `/qa-create-test`, `/qa-fix`, and `/qa-start` now pass research documents (`FRAMEWORK_CAPABILITIES.md`, `TESTING_STACK.md`, `E2E_STRATEGY.md`, `API_TESTING_STRATEGY.md`) to executor, e2e-runner, and bug-detective agents via `files_to_read`. Documents are optional — if they don't exist, agents query Context7 directly.
-- **Context7 verification in quality gate checklists** — executor, e2e-runner, and bug-detective now include Context7-specific items in their verification checklists (e.g., "Context7 was queried for selector API before generating POM files").
-### Changed
-- **`qaa-project-researcher` agent updated** — Context7 MCP tools (`resolve-library-id`, `query-docs`) now declared explicitly in frontmatter. Output path changed from `specs/research/` to `.qa-output/research/`. Added version awareness section: detects project's current framework version, generates all syntax for that version, and includes an informational note about newer versions (without recommending upgrade).
-- **Pipeline diagram updated** — all pipeline references across CLAUDE.md, README.md, and workflows now include the `research` stage: `scan → research → codebase-map → analyze → ...`.
-- **Module boundaries and dependency ordering updated** — CLAUDE.md now includes `qa-project-researcher` in the module boundaries table, stage transitions table, dependency ordering, and read-before-write rules.
-### Fixed
-- **`qaa-project-researcher` was an orphaned agent** — the agent existed since v1.2.0 but was never invoked by any command, workflow, or the pipeline orchestrator. The `/qa-research` command documented in `docs/COMMANDS.md` never had a corresponding file in `/commands/`. No downstream agent read its output files. This release connects the researcher to the pipeline and creates the missing command.
-## [1.8.6] - 2026-04-20
-### Added
-- **Fix mode analyze-first flow in `/qa-fix`** — fix mode now runs in two phases: Phase 1 analyzes and classifies all failures without touching any files, then presents a Fix Plan to the user. Phase 2 executes auto-fixes only after explicit user confirmation. Users can refine the plan iteratively (add fixes, remove files, change approach) in a loop until satisfied, then approve or cancel. Replaces the previous fully-automatic fix behavior.
-- **Codebase map context for bug-detective** — `/qa-fix` fix mode now passes all 4 codebase map documents (`CODE_PATTERNS.md`, `API_CONTRACTS.md`, `TEST_SURFACE.md`, `TESTABILITY.md`) to the bug-detective agent via `files_to_read`. Previously the bug-detective classified failures without project-specific context, leading to less accurate classifications.
-- **Mandatory bash checklist in `/qa-fix`** — verification block at the end of `qa-fix.md` that forces the agent to run `ls`/`cat`/`grep` commands to confirm artifacts were produced (classification report, locator registry, codebase map, MCP evidence, test files). Matches the existing checklist pattern in `/qa-create-test`.
-### Changed
-- **Validator fix loops increased from 3 to 5** — `qaa-validator` agent now has up to 5 fix loop iterations (previously 3), matching the E2E runner's loop budget. Updated across all references: locked decision, fix loop logic, checkpoint return, confidence criteria table, and quality gate checks.
-## [1.8.5] - 2026-04-17
-### Added
-- **Azure DevOps mode in `/qa-create-test`** — new `--ado` flag enables creating Test Cases directly in Azure DevOps from a work item. Supports work item ID or full ADO URL, auto-detects `dev.azure.com` and `*.visualstudio.com` URLs. Features include: boundary value triplet detection (N-1, N, N+1), deduplication against existing linked TCs, confidence scoring (Specified vs Draft), keyword-based Critical tagging, and preconditions block per test case.
-- **`/qa-create-test-ado` standalone command** — dedicated command for Azure DevOps test case creation with 7-phase workflow: retrieve work item with comments/attachments, dedup check, type-based content extraction (Bug → Repro Steps, User Story → Acceptance Criteria), test case design, creation in ADO via `testplan_create_test_case`, structured report generation, and report attachment to source work item.
-- **ADO-specific flags** — `--area-path`, `--iteration-path` (override paths for created TCs), `--skip-dedup` (skip deduplication check).
-### Changed
-- **`/qa-create-test` now supports 5 modes** — from-code, from-ticket, ADO, update, and POM-only (previously 3 modes). Mode detection updated to recognize ADO URLs before ticket URLs to avoid routing conflicts.
-## [1.8.1] - 2026-04-16
-### Added
-- **Context7 MCP integration** — `@upstash/context7-mcp` is now bundled alongside Playwright MCP. The installer registers both MCP servers in the user-scope config (`~/.claude.json`) so they're available in every project on the machine, not just in the QAA repo. Context7 gives every QAA agent on-demand access to up-to-date library documentation (Playwright, Cypress, Jest, Vitest, pytest, and any other framework), keeping generated tests aligned with current APIs instead of outdated training data.
-- **`bin/install.cjs` installer script** — the file was referenced in `package.json` but didn't actually exist on npm, causing `npx qaa-agent` to fail silently (`No bin file found at bin/install.cjs`). The installer now performs three steps on every run: (1) copies agents, commands, skills, templates, workflows, docs, and config files into the chosen scope (`~/.claude/qaa` for global, `./.claude/qaa` for local), (2) registers both MCP servers in `~/.claude.json` with idempotency — existing entries are not duplicated, and (3) deep-merges the QAA permissions into the user's `settings.json` without overwriting their existing settings.
-### Changed
-- **MCP registration is now user-scope by default** — previously MCPs were defined only in the project-level `.mcp.json`, which meant they only activated when the user opened the QAA repo itself. They now register in `~/.claude.json`, making them available in every Claude Code project on the user's machine. The project-level `.mcp.json` is kept for QAA development purposes but is no longer the source of truth for end users.
-### Fixed
-- **Silent `npx qaa-agent` failure** — users who installed QAA via npm before this release did not get Playwright or Context7 MCPs registered because the installer script was missing from the published package. Publishing 1.8.1 restores the expected behavior: a single `npx qaa-agent` command copies all files and registers both MCPs globally.
-## [1.8.0] - 2026-04-13
-### Added
-- **Active verification checklist in every agent** — all 8 pipeline agents now end their body with a `## Before completing any task, verify each item actively:` section that forces the agent to run real `ls` + `cat` + `grep` commands against `.qa-output/` artifacts, the Locator Registry, codebase map documents, and `MY_PREFERENCES.md` before closing the task. The output of those commands lands in the subagent's context (recency effect), so the model cannot skip reading inputs or leave outputs unwritten without the verification failing.
-- **`skills:` declared in YAML frontmatter for every agent** — `qaa-analyzer`, `qaa-planner`, `qaa-executor`, `qaa-validator`, `qaa-e2e-runner`, `qaa-bug-detective`, `qaa-testid-injector`, `qaa-codebase-mapper`, `qa-pipeline-orchestrator`. Claude Code now injects the matching SKILL.md content at the start of the subagent's context when the Task tool spawns it. Previously subagents spawned with empty context and ignored the skill entirely.
-- **Non-negotiable rules section in `qaa-bug-detective`** — explicit rules for Locator Registry persistence and MY_PREFERENCES.md updates, placed mid-body as redundant reinforcement between the frontmatter (start) and the active checklist (end).
-- **`MY_PREFERENCES.md` reads propagated across slash commands** — `qa-create-test`, `qa-fix`, `qa-audit`, `qa-map` now pass `~/.claude/qaa/MY_PREFERENCES.md` to every spawned agent via `files_to_read`.
-- **Locator Registry reads propagated** — `qa-fix` and `qa-audit` now pass `.qa-output/locators/` to bug-detective, e2e-runner, and validator subagents.
-- **Playwright MCP usage is now non-negotiable in 4 agents** — `qaa-e2e-runner`, `qaa-testid-injector`, `qaa-bug-detective`, `qaa-executor` now hardcode Non-negotiable rules in their body that make live browser interaction via Playwright MCP **mandatory** (not optional) under the appropriate conditions. Previously agents sometimes skipped MCP calls even when the skill described them, because the description was advisory rather than enforced.
-- **MCP evidence files** at `.qa-output/mcp-evidence/{agent-name}-session.md` — every MCP-using agent now writes a structured evidence file per session logging `session_start`, `session_end`, URLs navigated, snapshots/screenshots taken, interactions performed, and `browser_closed: true`. The active verification checklist at the end of each agent runs `ls` + `grep` on this evidence file; missing or empty file = invalid run = hard failure.
-- **Skip-reason tracking** — when MCP is legitimately skipped (no `app_url`, non-E2E failure, MCP not connected), agents must document the skip reason in their primary report (TESTID_AUDIT_REPORT.md / FAILURE_CLASSIFICATION_REPORT.md). Silent skips are no longer permitted.
-- **Locator resolution priority chain — invention is forbidden** — `qaa-executor`, `qaa-e2e-runner`, and `qaa-bug-detective` now enforce a strict priority order when writing any locator: (1) Locator Registry first, (2) frontend source code `grep` second, (3) Playwright MCP live DOM snapshot third, (4) HALT if nothing resolvable. Agents MUST NOT invent `data-testid` values or guess CSS selectors. Every locator written to a generated file requires `source: registry | codebase | mcp` attribution in the MCP evidence file — anything else triggers file deletion or revert.
-- **Priority hit counts logged** — MCP evidence files now track `priority1_hits` (registry reuse), `priority2_hits` (source extraction), `priority3_hits` (MCP discovery), and `priority4_halts` (unresolvable elements), giving a full audit trail of where every locator came from.
-### Changed
-- **Agent reliability pattern: triple reinforcement** — every critical rule is now reinforced three times: (1) `skills:` frontmatter injection at the start of context, (2) `required_reading` + mid-body non-negotiable rules, (3) active `ls`/`cat`/`grep` verification at the end. This closes the "lost in the middle" attention gap documented in long-context LLM research.
-- **`qaa-bug-detective`, `qaa-executor`, `qaa-e2e-runner`, `qaa-validator`** — existing active checklists extended with `.qa-output/` specific items (generation plan, test inventory, codebase map, validation layers, failure classification evidence).
-### Fixed
-- **Subagent skill loss** — when a parent agent spawned a subagent via `Task()`, the subagent ran with fresh context and ignored the skill entirely (it had no way to know a skill existed). Declaring `skills:` in the YAML frontmatter fixes this at the Claude Code loader level.
-- **Artifact-read drift** — agents would sometimes reference `.qa-output/` artifacts in their reasoning without actually reading them. The active `grep` on specific content (e.g. "RISK_MAP HIGH items", "VALIDATION_REPORT confidence level") forces real consumption.
-## [1.7.0] - 2026-04-02
-### Added
-- **qaa-testid-injector**: Playwright MCP integration for live DOM verification before injection, codebase map reading (CODE_PATTERNS, TEST_SURFACE, TESTABILITY), and locator registry cross-referencing
-- **qaa-validator**: codebase map reading (CODE_PATTERNS, TEST_SURFACE, API_CONTRACTS) for structure and logic validation, locator registry cross-check for POM accuracy
-- **qaa-planner**: locator registry reading to assess E2E feasibility and improve complexity estimation
-- **qaa-analyzer**: locator registry reading to inform risk assessment and testing pyramid recommendations
-- **qaa-e2e-runner**: locator registry update after execution -- all discovered real locators are persisted
-- **qa-validate workflow**: now passes codebase map and locator registry to validator agent
-- **qa-gap workflow**: now passes codebase map and locator registry to analyzer agent
-- **qa-testid workflow**: now passes codebase map, locator registry, and app_url to injector agent
-### Changed
-- **E2E runner max fix loops: 3 → 5** -- more attempts to fix locator/assertion mismatches before giving up
-- **Installer**: updated paths for new package structure (commands/ and skills/ at root level), updated command list to reflect 7 consolidated commands
-- **Package structure**: commands and skills now live at package root instead of `.claude/` subdirectory
-- **Repository**: moved to new GitHub organization
-### Consolidated
-- 7 slash commands: `/qa-start`, `/qa-create-test`, `/qa-map`, `/qa-testid`, `/qa-pr`, `/qa-audit`, `/qa-fix`
-- Removed standalone `/qa-analyze`, `/qa-validate`, `/qa-gap` -- integrated into other commands
-## [1.6.0] - 2026-03-25
-### Added
-- Playwright MCP server bundled in agent package (`.mcp.json`) -- starts automatically when opening project in Claude Code
-- Persistent locator registry at `.qa-output/locators/` -- accumulates real locators across features over time
-  - Per-feature files: `{feature}.locators.md` -- extracted locators for each feature tested
-  - Central index: `LOCATOR_REGISTRY.md` -- all locators by page, searchable by any command
-- Browser-based locator extraction step in `/create-test` and `/qa-from-ticket` -- navigates live app with Playwright MCP and captures real data-testid, ARIA roles, and labels before generating tests
-- Registry cache: if locators for a feature already exist in the registry, browser extraction is skipped (reuses cached locators)
-- `--app-url` flag added to `/qa-from-ticket`
-- CHANGELOG.md
-### Changed
-- `qaa-executor` now reads locator registry (when available) to use real locators in POMs instead of proposing them
-- `/create-test` flow: checks registry first, then extracts via browser if needed, BEFORE test generation
-- `/qa-from-ticket` workflow: locator extraction step added after source scan, before test case generation
-### Removed
-- `/qa-analyze` command (deprecated since v1.4.0, fully replaced by `/qa-map`)
-## [1.5.0] - 2026-03-24
-### Added
-- Stable release
-## [1.4.0]
-### Changed
-- Merged `/qa-analyze` into `/qa-map` -- single command for codebase scanning and analysis
-- Consolidated pipeline flow
-### Deprecated
-- `/qa-analyze` command (use `/qa-map` instead)
-## [1.3.0]
-### Added
-- `qa-learner` skill -- persistent preferences from user corrections
-- Preferences saved to `~/.claude/qaa/MY_PREFERENCES.md`
-- Trigger detection for English and Spanish frustration signals
-## [1.2.0]
-### Added
-- `qaa-codebase-mapper` agent -- 4 parallel focus areas (testability, risk, patterns, existing tests)
-- `qaa-project-researcher` agent -- researches best testing stack and practices
-- 8 codebase map documents produced by mapper
-## [1.1.0]
-### Added
-- Workflow definitions for all pipeline stages
-- Interactive installer (`npx qaa-agent`)
-- `qaa init` command for per-project initialization
-- npm package distribution
-## [1.0.0]
-### Added
-- Full QA automation pipeline -- 11 agents, 17 commands, 10 templates, 7 workflows
-- 3 workflow options (dev-only, immature QA, mature QA)
-- 4-layer test validation (syntax, structure, dependencies, logic)
-- Page Object Model generation with CLAUDE.md standards
-- Test ID injection for frontend components
-- Bug detective failure classification
-- Draft PR delivery with branch naming convention
+# Changelog
+All notable changes to QAA (QA Automation Agent) are documented here.
+## [1.9.2] - 2026-05-05
+### Fixed
+- **`qaa-codebase-mapper` was missing from `/qa-start` pipeline since v1.6.3** — the codebase-mapper agent existed and was documented as part of the pipeline (CLAUDE.md, README.md, qa-pipeline-orchestrator.md), but the `workflows/qa-start.md` file never invoked it as a step. Downstream agents (analyzer, planner, executor) had `(if exists)` references to the 8 codebase docs that always failed because no agent produced them. The fix adds a new `<step name="codebase_map">` between scan and research that spawns 4 parallel sub-agents (testability, risk, patterns, existing-tests) producing the 8 documents in `.qa-output/codebase/`. Pipeline order now matches documentation: `scan → codebase-map → research → analyze → ...`.
+- **Pipeline order corrected: `research` now runs AFTER `codebase-map`** — research stage was previously placed before codebase-map (added in v1.9.0 as STAGE 1b). The correct logical order is `scan → codebase-map → research` so the researcher can use codebase findings (RISK_MAP, CODE_PATTERNS, API_CONTRACTS) to make Context7 queries specific to the project's actual stack instead of generic framework questions. Research stage renamed to STAGE 1c.
+- **Downstream agents now receive the appropriate codebase docs in their `<files_to_read>`** — analyzer reads RISK_MAP, CRITICAL_PATHS, TEST_ASSESSMENT, COVERAGE_GAPS; planner reads TESTABILITY, TEST_SURFACE, CRITICAL_PATHS, COVERAGE_GAPS; executor reads TEST_SURFACE, CODE_PATTERNS, API_CONTRACTS. Mapping follows the codebase-mapper's own `why_this_matters` contract.
+- **`qaa-executor`, `qaa-e2e-runner`, and `qaa-bug-detective` now declare Context7 tools explicitly in frontmatter** — these agents had prompts saying "MUST query Context7" but did not declare the `mcp__context7__resolve-library-id` and `mcp__context7__query-docs` tools in their YAML frontmatter, relying on tool inheritance from the parent. Now declared explicitly along with Playwright MCP tools where applicable. Improves portability and respects the principle of least privilege.
+- **`VERSION` file now distributed in npm package** — added `"VERSION"` to the `files` array in `package.json` so the `VERSION` file is included in published packages. Previously, end users who ran `npx qaa-agent` never received an updated `VERSION` file because it was excluded from the npm package whitelist, leaving stale version numbers that did not match `package.json`.
+### Added
+- **Version-aware `libraryId` in Context7 queries** — when the project's framework version is detected from `package.json`/`requirements.txt`/lock files, agents now use a versioned library ID format (`/org/project/v1.40.0`) in `query-docs` calls. Context7 returns documentation specific to that version, not the latest. Generated code matches the framework version the project actually uses, avoiding APIs that don't exist or have changed. Applied to `qaa-project-researcher`, `qaa-executor`, `qaa-e2e-runner`, and `qaa-bug-detective`. Falls back to base library ID if version cannot be detected.
+- **`<codebase_grounding>` section in `qaa-project-researcher`** — explicit instructions for the researcher to use codebase map findings (RISK_MAP, CODE_PATTERNS, API_CONTRACTS, etc.) to ground Context7 queries to project specifics (e.g., "MSW integration with Playwright" instead of "Playwright mocking capabilities"). Includes a comparison table of generic vs grounded queries.
+## [1.9.1] - 2026-04-27
+### Added
+- **`/qa-test-report` command** — generates a per-ticket QA execution summary and appends it to the Azure DevOps work item's `Custom.QATestCasesReport` field.
+  - Resolves the work item and its linked test cases (TestedBy-Forward relations)
+  - Pulls test case execution status via REST `/_apis/test/points` (using `ADO_MCP_AUTH_TOKEN` env var as PAT, Basic auth — same pattern as `/qa-create-test --ado`), or falls back to a manual prompt when no run result exists or the token is not set
+  - Renders a markdown report in chat (for review) and an HTML report appended to the ADO field — preserving prior content with a blank-line separator (no local file is written)
+  - Smoke-tested end-to-end (manual mode, all passed) — field write and ADO render verified
+## [1.9.0] - 2026-04-24
+### Added
+- **`/qa-research` command** — new standalone command that invokes the `qaa-project-researcher` agent to investigate the testing ecosystem for the current project. Supports 4 modes via `--focus` flag: `stack-testing` (default, full stack analysis), `framework-deep-dive` (deep dive into detected framework), `api-testing` (API testing strategy), `e2e-strategy` (E2E strategy). Produces research documents in `.qa-output/research/` consumed by all downstream agents.
+- **Research stage in `/qa-start` pipeline** — the full pipeline now runs `scan → research → codebase-map → analyze → ...`. The researcher agent runs after the scanner, using Context7 MCP as the primary source for framework documentation. Research is non-blocking: if it fails, downstream agents fall back to querying Context7 directly.
+- **Context7 MCP mandatory in 3 agents** — `qaa-executor`, `qaa-e2e-runner`, and `qaa-bug-detective` now have a non-negotiable "Framework Verification via Context7" section. Before generating code, fixing locators, or auto-fixing test errors, these agents MUST query Context7 for the framework's current API and syntax. This applies to ALL frameworks including well-known ones (Playwright, Cypress, Jest) — training data may be outdated.
+- **Research documents propagated across commands** — `/qa-create-test`, `/qa-fix`, and `/qa-start` now pass research documents (`FRAMEWORK_CAPABILITIES.md`, `TESTING_STACK.md`, `E2E_STRATEGY.md`, `API_TESTING_STRATEGY.md`) to executor, e2e-runner, and bug-detective agents via `files_to_read`. Documents are optional — if they don't exist, agents query Context7 directly.
+- **Context7 verification in quality gate checklists** — executor, e2e-runner, and bug-detective now include Context7-specific items in their verification checklists (e.g., "Context7 was queried for selector API before generating POM files").
+### Changed
+- **`qaa-project-researcher` agent updated** — Context7 MCP tools (`resolve-library-id`, `query-docs`) now declared explicitly in frontmatter. Output path changed from `specs/research/` to `.qa-output/research/`. Added version awareness section: detects project's current framework version, generates all syntax for that version, and includes an informational note about newer versions (without recommending upgrade).
+- **Pipeline diagram updated** — all pipeline references across CLAUDE.md, README.md, and workflows now include the `research` stage: `scan → research → codebase-map → analyze → ...`.
+- **Module boundaries and dependency ordering updated** — CLAUDE.md now includes `qa-project-researcher` in the module boundaries table, stage transitions table, dependency ordering, and read-before-write rules.
+### Fixed
+- **`qaa-project-researcher` was an orphaned agent** — the agent existed since v1.2.0 but was never invoked by any command, workflow, or the pipeline orchestrator. The `/qa-research` command documented in `docs/COMMANDS.md` never had a corresponding file in `/commands/`. No downstream agent read its output files. This release connects the researcher to the pipeline and creates the missing command.
+## [1.8.6] - 2026-04-20
+### Added
+- **Fix mode analyze-first flow in `/qa-fix`** — fix mode now runs in two phases: Phase 1 analyzes and classifies all failures without touching any files, then presents a Fix Plan to the user. Phase 2 executes auto-fixes only after explicit user confirmation. Users can refine the plan iteratively (add fixes, remove files, change approach) in a loop until satisfied, then approve or cancel. Replaces the previous fully-automatic fix behavior.
+- **Codebase map context for bug-detective** — `/qa-fix` fix mode now passes all 4 codebase map documents (`CODE_PATTERNS.md`, `API_CONTRACTS.md`, `TEST_SURFACE.md`, `TESTABILITY.md`) to the bug-detective agent via `files_to_read`. Previously the bug-detective classified failures without project-specific context, leading to less accurate classifications.
+- **Mandatory bash checklist in `/qa-fix`** — verification block at the end of `qa-fix.md` that forces the agent to run `ls`/`cat`/`grep` commands to confirm artifacts were produced (classification report, locator registry, codebase map, MCP evidence, test files). Matches the existing checklist pattern in `/qa-create-test`.
+### Changed
+- **Validator fix loops increased from 3 to 5** — `qaa-validator` agent now has up to 5 fix loop iterations (previously 3), matching the E2E runner's loop budget. Updated across all references: locked decision, fix loop logic, checkpoint return, confidence criteria table, and quality gate checks.
+## [1.8.5] - 2026-04-17
+### Added
+- **Azure DevOps mode in `/qa-create-test`** — new `--ado` flag enables creating Test Cases directly in Azure DevOps from a work item. Supports work item ID or full ADO URL, auto-detects `dev.azure.com` and `*.visualstudio.com` URLs. Features include: boundary value triplet detection (N-1, N, N+1), deduplication against existing linked TCs, confidence scoring (Specified vs Draft), keyword-based Critical tagging, and preconditions block per test case.
+- **`/qa-create-test-ado` standalone command** — dedicated command for Azure DevOps test case creation with 7-phase workflow: retrieve work item with comments/attachments, dedup check, type-based content extraction (Bug → Repro Steps, User Story → Acceptance Criteria), test case design, creation in ADO via `testplan_create_test_case`, structured report generation, and report attachment to source work item.
+- **ADO-specific flags** — `--area-path`, `--iteration-path` (override paths for created TCs), `--skip-dedup` (skip deduplication check).
+### Changed
+- **`/qa-create-test` now supports 5 modes** — from-code, from-ticket, ADO, update, and POM-only (previously 3 modes). Mode detection updated to recognize ADO URLs before ticket URLs to avoid routing conflicts.
+## [1.8.1] - 2026-04-16
+### Added
+- **Context7 MCP integration** — `@upstash/context7-mcp` is now bundled alongside Playwright MCP. The installer registers both MCP servers in the user-scope config (`~/.claude.json`) so they're available in every project on the machine, not just in the QAA repo. Context7 gives every QAA agent on-demand access to up-to-date library documentation (Playwright, Cypress, Jest, Vitest, pytest, and any other framework), keeping generated tests aligned with current APIs instead of outdated training data.
+- **`bin/install.cjs` installer script** — the file was referenced in `package.json` but didn't actually exist on npm, causing `npx qaa-agent` to fail silently (`No bin file found at bin/install.cjs`). The installer now performs three steps on every run: (1) copies agents, commands, skills, templates, workflows, docs, and config files into the chosen scope (`~/.claude/qaa` for global, `./.claude/qaa` for local), (2) registers both MCP servers in `~/.claude.json` with idempotency — existing entries are not duplicated, and (3) deep-merges the QAA permissions into the user's `settings.json` without overwriting their existing settings.
+### Changed
+- **MCP registration is now user-scope by default** — previously MCPs were defined only in the project-level `.mcp.json`, which meant they only activated when the user opened the QAA repo itself. They now register in `~/.claude.json`, making them available in every Claude Code project on the user's machine. The project-level `.mcp.json` is kept for QAA development purposes but is no longer the source of truth for end users.
+### Fixed
+- **Silent `npx qaa-agent` failure** — users who installed QAA via npm before this release did not get Playwright or Context7 MCPs registered because the installer script was missing from the published package. Publishing 1.8.1 restores the expected behavior: a single `npx qaa-agent` command copies all files and registers both MCPs globally.
+## [1.8.0] - 2026-04-13
+### Added
+- **Active verification checklist in every agent** — all 8 pipeline agents now end their body with a `## Before completing any task, verify each item actively:` section that forces the agent to run real `ls` + `cat` + `grep` commands against `.qa-output/` artifacts, the Locator Registry, codebase map documents, and `MY_PREFERENCES.md` before closing the task. The output of those commands lands in the subagent's context (recency effect), so the model cannot skip reading inputs or leave outputs unwritten without the verification failing.
+- **`skills:` declared in YAML frontmatter for every agent** — `qaa-analyzer`, `qaa-planner`, `qaa-executor`, `qaa-validator`, `qaa-e2e-runner`, `qaa-bug-detective`, `qaa-testid-injector`, `qaa-codebase-mapper`, `qa-pipeline-orchestrator`. Claude Code now injects the matching SKILL.md content at the start of the subagent's context when the Task tool spawns it. Previously subagents spawned with empty context and ignored the skill entirely.
+- **Non-negotiable rules section in `qaa-bug-detective`** — explicit rules for Locator Registry persistence and `MY_PREFERENCES.md` updates, placed mid-body as redundant reinforcement between the frontmatter (start) and the active checklist (end).
+- **`MY_PREFERENCES.md` reads propagated across slash commands** — `qa-create-test`, `qa-fix`, `qa-audit`, `qa-map` now pass `~/.claude/qaa/MY_PREFERENCES.md` to every spawned agent via `files_to_read`.
+- **Locator Registry reads propagated** — `qa-fix` and `qa-audit` now pass `.qa-output/locators/` to bug-detective, e2e-runner, and validator subagents.
+- **Playwright MCP usage is now non-negotiable in 4 agents** — `qaa-e2e-runner`, `qaa-testid-injector`, `qaa-bug-detective`, `qaa-executor` now hardcode Non-negotiable rules in their body that make live browser interaction via Playwright MCP **mandatory** (not optional) under the appropriate conditions. Previously agents sometimes skipped MCP calls even when the skill described them, because the description was advisory rather than enforced.
+- **MCP evidence files** at `.qa-output/mcp-evidence/{agent-name}-session.md` — every MCP-using agent now writes a structured evidence file per session logging `session_start`, `session_end`, URLs navigated, snapshots/screenshots taken, interactions performed, and `browser_closed: true`. The active verification checklist at the end of each agent runs `ls` + `grep` on this evidence file; a missing or empty file invalidates the run and is treated as a hard failure.
+- **Skip-reason tracking** — when MCP is legitimately skipped (no `app_url`, non-E2E failure, MCP not connected), agents must document the skip reason in their primary report (`TESTID_AUDIT_REPORT.md` / `FAILURE_CLASSIFICATION_REPORT.md`). Silent skips are no longer permitted.
+- **Locator resolution priority chain — invention is forbidden** — `qaa-executor`, `qaa-e2e-runner`, and `qaa-bug-detective` now enforce a strict priority order when writing any locator: (1) Locator Registry first, (2) frontend source code `grep` second, (3) Playwright MCP live DOM snapshot third, (4) HALT if nothing resolvable. Agents MUST NOT invent `data-testid` values or guess CSS selectors. Every locator written to a generated file requires `source: registry | codebase | mcp` attribution in the MCP evidence file — anything else triggers file deletion or revert.
+- **Priority hit counts logged** — MCP evidence files now track `priority1_hits` (registry reuse), `priority2_hits` (source extraction), `priority3_hits` (MCP discovery), and `priority4_halts` (unresolvable elements), giving a full audit trail of where every locator came from.
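The locator priority chain and its hit counters can be sketched as a single lookup function. This is a hedged Python illustration only: `resolve_locator`, `UnresolvedLocator`, and the sample lookups are hypothetical names, and the real agents are prompt-driven rather than implemented this way. Only the counter names and the `source` values come from the changelog.

```python
from collections import Counter
from typing import Callable, Optional, Tuple

Lookup = Callable[[str], Optional[str]]


class UnresolvedLocator(Exception):
    """Raised when no source can supply a locator (priority 4: HALT)."""


def resolve_locator(
    element: str, registry: Lookup, codebase: Lookup, mcp: Lookup, hits: Counter
) -> Tuple[str, str]:
    """Try each source in priority order and return (locator, source)."""
    sources = [
        ("registry", "priority1_hits", registry),
        ("codebase", "priority2_hits", codebase),
        ("mcp", "priority3_hits", mcp),
    ]
    for source, counter_key, lookup in sources:
        locator = lookup(element)
        if locator:
            hits[counter_key] += 1
            return locator, source
    # Nothing resolvable: record the halt and stop instead of guessing.
    hits["priority4_halts"] += 1
    raise UnresolvedLocator(element)


# Hypothetical lookups standing in for the registry, a source grep, and MCP.
registry = {"login button": "[data-testid='login-submit']"}
source_code = {"email field": "[data-testid='login-email']"}
hits = Counter()
first = resolve_locator("login button", registry.get, source_code.get, lambda _: None, hits)
```

Raising an exception at priority 4 mirrors the HALT rule: the caller never receives an invented selector, and the counter still records the unresolved element.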
+### Changed
+- **Agent reliability pattern: triple reinforcement** — every critical rule is now reinforced three times: (1) `skills:` frontmatter injection at the start of context, (2) `required_reading` + mid-body non-negotiable rules, (3) active `ls`/`cat`/`grep` verification at the end. This closes the "lost in the middle" attention gap documented in long-context LLM research.
+- **`qaa-bug-detective`, `qaa-executor`, `qaa-e2e-runner`, `qaa-validator`** — existing active checklists extended with `.qa-output/`-specific items (generation plan, test inventory, codebase map, validation layers, failure classification evidence).
+### Fixed
+- **Subagent skill loss** — when a parent agent spawned a subagent via `Task()`, the subagent ran with fresh context and ignored the skill entirely (it had no way to know a skill existed). Declaring `skills:` in the YAML frontmatter fixes this at the Claude Code loader level.
+- **Artifact-read drift** — agents would sometimes reference `.qa-output/` artifacts in their reasoning without actually reading them. The active `grep` on specific content (e.g. "RISK_MAP HIGH items", "VALIDATION_REPORT confidence level") forces real consumption.
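The end-of-task evidence check from the 1.8.0 notes amounts to "the file must exist, be non-empty, and contain the expected fields". A minimal Python sketch of that check follows. The agents themselves run `ls` and `grep`, so the function name `check_evidence` and the plain `key: value` file layout are assumptions made for illustration.

```python
from pathlib import Path
from typing import List

# Field names taken from the changelog; the exact file format is assumed.
REQUIRED_MARKERS = ("session_start", "session_end", "browser_closed: true")


def check_evidence(output_dir: Path, agent_name: str) -> List[str]:
    """Return a list of problems; an empty list means the check passes."""
    path = output_dir / "mcp-evidence" / f"{agent_name}-session.md"
    if not path.is_file():
        return [f"missing evidence file: {path.name}"]
    text = path.read_text(encoding="utf-8")
    if not text.strip():
        return [f"empty evidence file: {path.name}"]
    # Equivalent of grepping the file for each required field.
    return [f"missing field: {marker}" for marker in REQUIRED_MARKERS if marker not in text]
```

A caller would treat any non-empty result as the hard failure described above, for example `check_evidence(Path(".qa-output"), "qaa-e2e-runner")`.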
+## [1.7.0] - 2026-04-02
+### Added
+- **qaa-testid-injector**: Playwright MCP integration for live DOM verification before injection, codebase map reading (CODE_PATTERNS, TEST_SURFACE, TESTABILITY), and locator registry cross-referencing
+- **qaa-validator**: codebase map reading (CODE_PATTERNS, TEST_SURFACE, API_CONTRACTS) for structure and logic validation, locator registry cross-check for POM accuracy
+- **qaa-planner**: locator registry reading to assess E2E feasibility and improve complexity estimation
+- **qaa-analyzer**: locator registry reading to inform risk assessment and testing pyramid recommendations
+- **qaa-e2e-runner**: locator registry update after execution -- all discovered real locators are persisted
+- **qa-validate workflow**: now passes codebase map and locator registry to validator agent
+- **qa-gap workflow**: now passes codebase map and locator registry to analyzer agent
+- **qa-testid workflow**: now passes codebase map, locator registry, and app_url to injector agent
+### Changed
+- **E2E runner max fix loops: 3 → 5** -- more attempts to fix locator/assertion mismatches before giving up
+- **Installer**: updated paths for new package structure (commands/ and skills/ at root level), updated command list to reflect 7 consolidated commands
+- **Package structure**: commands and skills now live at package root instead of `.claude/` subdirectory
+- **Repository**: moved to new GitHub organization
+### Consolidated
+- 7 slash commands: `/qa-start`, `/qa-create-test`, `/qa-map`, `/qa-testid`, `/qa-pr`, `/qa-audit`, `/qa-fix`
+- Removed standalone `/qa-analyze`, `/qa-validate`, `/qa-gap` -- integrated into other commands
+## [1.6.0] - 2026-03-25
+### Added
+- Playwright MCP server bundled in agent package (`.mcp.json`) -- starts automatically when opening project in Claude Code
+- Persistent locator registry at `.qa-output/locators/` -- accumulates real locators across features over time
+  - Per-feature files: `{feature}.locators.md` -- extracted locators for each feature tested
+  - Central index: `LOCATOR_REGISTRY.md` -- all locators by page, searchable by any command
+- Browser-based locator extraction step in `/create-test` and `/qa-from-ticket` -- navigates live app with Playwright MCP and captures real data-testid, ARIA roles, and labels before generating tests
+- Registry cache: if locators for a feature already exist in the registry, browser extraction is skipped (reuses cached locators)
+- `--app-url` flag added to `/qa-from-ticket`
+- CHANGELOG.md
+### Changed
+- `qaa-executor` now reads locator registry (when available) to use real locators in POMs instead of proposing them
+- `/create-test` flow: checks registry first, then extracts via browser if needed, BEFORE test generation
+- `/qa-from-ticket` workflow: locator extraction step added after source scan, before test case generation
+### Removed
+- `/qa-analyze` command (deprecated since v1.4.0, fully replaced by `/qa-map`)
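The registry-first behaviour introduced in 1.6.0 (skip browser extraction when a feature's locators are already cached) can be sketched as a small predicate. This is an illustrative Python sketch under stated assumptions: `needs_browser_extraction` is a hypothetical name, and treating an empty file as a cache miss is an assumption, not documented behaviour. The `{feature}.locators.md` file name comes from the changelog.

```python
from pathlib import Path


def needs_browser_extraction(feature: str, registry_dir: Path) -> bool:
    """True when the registry has no usable locator file for the feature."""
    cached = registry_dir / f"{feature}.locators.md"
    # A present, non-empty per-feature file counts as a cache hit.
    has_cache = cached.is_file() and bool(cached.read_text(encoding="utf-8").strip())
    return not has_cache
```

With the directory layout from the notes, a call would look like `needs_browser_extraction("login", Path(".qa-output/locators"))`.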
+## [1.5.0] - 2026-03-24
+### Added
+- Stable release
+## [1.4.0]
+### Changed
+- Merged `/qa-analyze` into `/qa-map` -- single command for codebase scanning and analysis
+- Consolidated pipeline flow
+### Deprecated
+- `/qa-analyze` command (use `/qa-map` instead)
+## [1.3.0]
+### Added
+- `qa-learner` skill -- persistent preferences from user corrections
+- Preferences saved to `~/.claude/qaa/MY_PREFERENCES.md`
+- Trigger detection for English and Spanish frustration signals
+## [1.2.0]
+### Added
+- `qaa-codebase-mapper` agent -- 4 parallel fo