qaa-agent 1.0.0

Files changed (56)
  1. package/.claude/commands/create-test.md +40 -0
  2. package/.claude/commands/qa-analyze.md +60 -0
  3. package/.claude/commands/qa-audit.md +37 -0
  4. package/.claude/commands/qa-blueprint.md +54 -0
  5. package/.claude/commands/qa-fix.md +36 -0
  6. package/.claude/commands/qa-from-ticket.md +88 -0
  7. package/.claude/commands/qa-gap.md +54 -0
  8. package/.claude/commands/qa-pom.md +36 -0
  9. package/.claude/commands/qa-pyramid.md +37 -0
  10. package/.claude/commands/qa-report.md +38 -0
  11. package/.claude/commands/qa-start.md +33 -0
  12. package/.claude/commands/qa-testid.md +54 -0
  13. package/.claude/commands/qa-validate.md +54 -0
  14. package/.claude/commands/update-test.md +58 -0
  15. package/.claude/settings.json +19 -0
  16. package/.claude/skills/qa-bug-detective/SKILL.md +122 -0
  17. package/.claude/skills/qa-repo-analyzer/SKILL.md +88 -0
  18. package/.claude/skills/qa-self-validator/SKILL.md +109 -0
  19. package/.claude/skills/qa-template-engine/SKILL.md +113 -0
  20. package/.claude/skills/qa-testid-injector/SKILL.md +93 -0
  21. package/.claude/skills/qa-workflow-documenter/SKILL.md +87 -0
  22. package/CLAUDE.md +543 -0
  23. package/README.md +418 -0
  24. package/agents/qa-pipeline-orchestrator.md +1217 -0
  25. package/agents/qaa-analyzer.md +508 -0
  26. package/agents/qaa-bug-detective.md +444 -0
  27. package/agents/qaa-executor.md +618 -0
  28. package/agents/qaa-planner.md +374 -0
  29. package/agents/qaa-scanner.md +422 -0
  30. package/agents/qaa-testid-injector.md +583 -0
  31. package/agents/qaa-validator.md +450 -0
  32. package/bin/install.cjs +176 -0
  33. package/bin/lib/commands.cjs +709 -0
  34. package/bin/lib/config.cjs +307 -0
  35. package/bin/lib/core.cjs +497 -0
  36. package/bin/lib/frontmatter.cjs +299 -0
  37. package/bin/lib/init.cjs +989 -0
  38. package/bin/lib/milestone.cjs +241 -0
  39. package/bin/lib/model-profiles.cjs +60 -0
  40. package/bin/lib/phase.cjs +911 -0
  41. package/bin/lib/roadmap.cjs +306 -0
  42. package/bin/lib/state.cjs +748 -0
  43. package/bin/lib/template.cjs +222 -0
  44. package/bin/lib/verify.cjs +842 -0
  45. package/bin/qaa-tools.cjs +607 -0
  46. package/package.json +34 -0
  47. package/templates/failure-classification.md +391 -0
  48. package/templates/gap-analysis.md +409 -0
  49. package/templates/pr-template.md +48 -0
  50. package/templates/qa-analysis.md +381 -0
  51. package/templates/qa-audit-report.md +465 -0
  52. package/templates/qa-repo-blueprint.md +636 -0
  53. package/templates/scan-manifest.md +312 -0
  54. package/templates/test-inventory.md +582 -0
  55. package/templates/testid-audit-report.md +354 -0
  56. package/templates/validation-report.md +243 -0
package/agents/qaa-analyzer.md (new file, +508 lines)

<purpose>
Analyze a scanned repository to produce QA_ANALYSIS.md and TEST_INVENTORY.md -- the two primary analysis artifacts that drive all downstream test planning and generation. Consumes SCAN_MANIFEST.md (produced by the scanner agent) and CLAUDE.md (QA standards) to produce a comprehensive testability report with architecture overview, risk assessment, top 10 unit test targets, API contract targets, and a testing pyramid distribution tailored to the specific repository. Produces a pyramid-based test case inventory where every test case has a unique ID, specific target, concrete inputs, explicit expected outcome with exact values, and priority. Optionally produces QA_REPO_BLUEPRINT.md for Option 1 (dev-only) workflows when no existing QA repository exists. Spawned by the orchestrator after the scanner completes successfully via Task(subagent_type='qaa-analyzer').
</purpose>

<required_reading>
Read ALL of the following files BEFORE producing any output. The subagent MUST read the CLAUDE.md Test Spec Rules to understand assertion specificity requirements. Skipping any of these files will produce non-compliant, low-quality output.

- **SCAN_MANIFEST.md** -- Path provided by orchestrator in files_to_read. This is the scanner's output containing the complete file tree, framework detection, testable surfaces, and decision gate. Read the entire file.
- **templates/qa-analysis.md** -- QA_ANALYSIS output format contract. Defines the 6 required sections, field definitions per section, quality gate checklist, and a worked example. Your QA_ANALYSIS.md output must match this structure exactly.
- **templates/test-inventory.md** -- TEST_INVENTORY output format contract. Defines the 5 required sections, per-test-case mandatory fields (all 7 for unit tests), quality gate checklist, and a worked example with 45 test cases. Your TEST_INVENTORY.md output must match this structure exactly.
- **templates/qa-repo-blueprint.md** -- QA_REPO_BLUEPRINT format contract. Defines the 7 required sections for the repository blueprint. Produce this artifact only for Option 1 workflows.
- **CLAUDE.md** -- Read these specific sections:
  - **Testing Pyramid**: Target distribution (60-70% unit, 10-15% integration, 20-25% API, 3-5% E2E)
  - **Test Spec Rules**: Mandatory fields for every test case (unique ID, exact target, concrete inputs, explicit expected outcome, priority)
  - **Naming Conventions**: Test ID formats (UT-MODULE-NNN, INT-MODULE-NNN, API-RESOURCE-NNN, E2E-FLOW-NNN)
  - **Quality Gates**: Assertion specificity rules -- no outcome may say "correct", "proper", "appropriate", or "works" without a concrete value
  - **Module Boundaries**: qa-analyzer reads SCAN_MANIFEST.md and CLAUDE.md; produces QA_ANALYSIS.md, TEST_INVENTORY.md, and QA_REPO_BLUEPRINT.md (Option 1) or GAP_ANALYSIS.md (Option 2/3)
  - **Verification Commands**: QA_ANALYSIS.md and TEST_INVENTORY.md verification rules
  - **Read-Before-Write Rules**: qa-analyzer must read SCAN_MANIFEST.md (complete, verified) and CLAUDE.md (all QA standards sections) before producing output
</required_reading>

<process>

<step name="read_inputs" priority="first">
Read all required input files before any analysis work.

1. **Read SCAN_MANIFEST.md** completely (path from orchestrator's files_to_read):
   - Extract: project detection (framework, language, runtime, component patterns)
   - Extract: file list with classifications and priority levels
   - Extract: summary statistics (total files, file type distribution)
   - Extract: testable surfaces (API endpoints, services, models, middleware, utilities, frontend components)
   - Extract: decision gate (PROCEED/STOP, has_frontend flag, detection confidence)
   - Verify SCAN_MANIFEST.md has all 5 sections populated. If any section is missing or incomplete, note the specific gaps for the Assumptions section.

2. **Read templates/qa-analysis.md** -- Extract the 6 required sections and their field definitions:
   - Section 1: Architecture Overview (properties table, entry points table, internal layers)
   - Section 2: External Dependencies (dependency, purpose, version, risk_level, justification)
   - Section 3: Risk Assessment (risk_id RISK-NNN, area, severity, description, evidence, testing_implication)
   - Section 4: Top 10 Unit Test Targets (rank, module_path, function_or_method, why_high_priority, complexity, suggested_test_count)
   - Section 5: API/Contract Test Targets (endpoint, request_contract, response_contract, auth_required, test_priority)
   - Section 6: Recommended Testing Pyramid (ASCII visualization, tier table, justification paragraph)

3. **Read templates/test-inventory.md** -- Extract the 5 required sections and per-test-case mandatory fields:
   - Section 1: Summary (total_tests, per-tier counts and percentages, p0/p1/p2 counts, coverage_narrative)
   - Section 2: Unit Tests -- ALL 7 mandatory fields per test case:
     - test_id (UT-MODULE-NNN)
     - target (file_path:function_name)
     - what_to_validate (one-sentence behavior description)
     - concrete_inputs (actual values -- NOT "valid data")
     - mocks_needed (dependencies to mock, or "None (pure function)")
     - expected_outcome (exact return value, error message, or state change)
     - priority (P0, P1, or P2)
   - Section 3: Integration/Contract Tests (INT-MODULE-NNN, components_involved, what_to_validate, setup_required, expected_outcome, priority)
   - Section 4: API Tests (API-RESOURCE-NNN, method_endpoint, request_body, headers, expected_status, expected_response, priority)
   - Section 5: E2E Smoke Tests (E2E-FLOW-NNN, user_journey, pages_involved, expected_outcome, priority -- always P0)

4. **Read templates/qa-repo-blueprint.md** -- Extract the 7 required sections:
   - Section 1: Project Info
   - Section 2: Folder Structure
   - Section 3: Recommended Stack
   - Section 4: Config Files
   - Section 5: Execution Scripts
   - Section 6: CI/CD Strategy
   - Section 7: Definition of Done

5. **Read CLAUDE.md** sections:
   - Testing Pyramid (pyramid target percentages)
   - Test Spec Rules (mandatory fields for every test case)
   - Naming Conventions (test ID format)
   - Quality Gates (assertion specificity rules -- the anti-pattern checklist)
   - Module Boundaries (what the analyzer reads and produces)
   - Verification Commands for QA_ANALYSIS.md and TEST_INVENTORY.md
   - Read-Before-Write Rules
</step>

<step name="assumptions_checkpoint">
Before generating any analysis artifacts, produce an interactive checkpoint so the user can confirm or correct your understanding of the codebase. This catches misunderstandings early and avoids generating an entire analysis based on wrong assumptions.

1. **Read SCAN_MANIFEST.md** completely -- study the file tree, dependencies, testable surfaces, and framework detection results.

2. **List 3-8 assumptions** about the codebase, each citing specific evidence from the scan data. Examples:
   - "Auth uses JWT based on jsonwebtoken in package.json dependencies"
   - "Database is PostgreSQL based on Prisma datasource in schema.prisma"
   - "Payment processing uses Stripe based on stripe package in dependencies"
   - "Frontend uses React based on react and react-dom in package.json and .tsx file extensions"
   - "API follows RESTful patterns based on route file structure in src/routes/"
   - "No existing test infrastructure detected (no test config files, no test directories)"

3. **List 0-3 questions** that genuinely affect analysis quality. Only ask questions where the answer would change the analysis output. Examples:
   - "Is the Stripe integration in production or test mode?"
   - "Are there additional API endpoints not captured in route files?"
   - "Is the frontend a separate deployment or served from the same server?"

4. **Return checkpoint** with this exact structure:

```
CHECKPOINT_RETURN:
  completed: "Read SCAN_MANIFEST.md, identified assumptions and questions"
  blocking: "Need user confirmation on assumptions before generating analysis"
  details:
    assumptions:
      - assumption: "[text describing what you assume about the codebase]"
        evidence: "[specific file, dependency, or pattern from SCAN_MANIFEST.md that supports this]"
      - assumption: "[text]"
        evidence: "[evidence]"
      ...
    questions:
      - "[question text -- only if the answer genuinely affects analysis]"
      ...
  awaiting: "User confirms assumptions are correct or provides corrections. User answers questions if any."
```

**If running in auto-advance mode:** The orchestrator will auto-approve the assumptions. Proceed to the next step immediately.

**If the user provides corrections:** Incorporate them before generating the analysis. If a correction invalidates a major assumption (e.g., "that's not Stripe, it's our custom payment gateway"), adjust the architecture overview, risk assessment, and test targets accordingly.
</step>

<step name="produce_qa_analysis">
After assumptions are confirmed (or auto-approved), produce QA_ANALYSIS.md with ALL 6 required sections from templates/qa-analysis.md.

**Section 1: Architecture Overview**

Populate the properties table with values specific to this repository:
- system_type: Application category (REST API, monolith, microservice, SPA, full-stack)
- language: Primary language and version (from SCAN_MANIFEST.md project detection)
- runtime: Runtime environment and version
- framework: Primary framework and version
- database: Database technology and access layer
- authentication: Auth mechanism identified from source code
- integrations: External service integrations found in dependencies
- deployment: Deployment target, if detectable from config files

Create the Entry Points table listing every route file with:
- route_file path
- base_path (URL prefix)
- methods (HTTP methods and endpoint names)
- auth_required (which endpoints require authentication)

Document Internal Layers showing the directory structure with data flow direction (e.g., Routes -> Controllers -> Services -> Models -> Database).

**Section 2: External Dependencies**

Create a table of production dependencies with:
- dependency name and version
- purpose (what the app uses it for)
- risk_level: HIGH, MEDIUM, or LOW
- justification: Why this risk level, specific to how THIS app uses it

Risk classification rules:
- **HIGH:** Handles payments, authentication, sensitive data, critical business rules, or data persistence. Failure = data loss, security breach, or revenue impact.
- **MEDIUM:** Important but recoverable. Email, file uploads, caching, validation. Failure = degraded experience.
- **LOW:** Utility functions, formatting, dev tooling. Failure = minor inconvenience.

Do NOT include dev-only dependencies (eslint, prettier, typescript compiler).

**Section 3: Risk Assessment**

Identify specific risks from the codebase. Every risk MUST:
- Have a unique ID in RISK-NNN format (e.g., RISK-001)
- Specify the area (module or feature)
- Assign severity: HIGH, MEDIUM, or LOW
- Describe specifically what could go wrong
- Cite a specific file or function as evidence -- NEVER produce generic risks like "SQL injection is possible" without pointing to an actual vulnerable query or pattern
- State the testing implication (what tests are needed)

**Section 4: Top 10 Unit Test Targets**

Rank 10 targets by a weighted composite score: business_impact (40%) + complexity (30%) + change_frequency (30%).

For each target provide:
- rank (1-10)
- module_path (file path relative to project root)
- function_or_method (specific function name -- not just a file)
- why_high_priority (business justification)
- complexity (lines of code, branch count, dependency count)
- suggested_test_count (estimated test cases needed)

Rank by business impact first (what breaks if this function has a bug?), not alphabetically.
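The 40/30/30 composite score above reads most naturally as a weighted sum. A minimal sketch under that assumption -- the 0-10 per-dimension scores and the object field names are hypothetical illustrations, not part of this spec:

```javascript
// Weighted composite score: 40% business impact, 30% complexity,
// 30% change frequency. Each dimension is an assumed 0-10 score.
function compositeScore(target) {
  return (
    0.4 * target.businessImpact +
    0.3 * target.complexity +
    0.3 * target.changeFrequency
  );
}

// Sort descending by score, keep at most 10, and attach the rank.
function rankTargets(targets) {
  return [...targets]
    .sort((a, b) => compositeScore(b) - compositeScore(a))
    .slice(0, 10)
    .map((t, i) => ({ rank: i + 1, ...t }));
}
```

A high-impact payment service (9/8/7) scores 8.1 and outranks a low-impact formatter (2/3/2) at 2.3, regardless of alphabetical order.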

**Section 5: API/Contract Test Targets**

Group endpoints by resource. For each endpoint provide:
- endpoint (HTTP method + path)
- request_contract (expected request body/params shape)
- response_contract (expected status + response body shape)
- auth_required (true/false)
- test_priority (P0, P1, or P2)

Include both happy-path and error response contracts.
Order within groups: POST -> GET -> PUT/PATCH -> DELETE.
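The within-group ordering rule can be expressed as a simple comparator. The `{ method, path }` endpoint shape is an illustrative assumption:

```javascript
// POST -> GET -> PUT/PATCH -> DELETE, per the ordering rule above.
// PUT and PATCH share a slot, so they keep their relative order.
const METHOD_ORDER = { POST: 0, GET: 1, PUT: 2, PATCH: 2, DELETE: 3 };

function sortEndpoints(endpoints) {
  // Array.prototype.sort is stable in Node, so ties preserve input order.
  return [...endpoints].sort(
    (a, b) => METHOD_ORDER[a.method] - METHOD_ORDER[b.method]
  );
}
```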

**Section 6: Recommended Testing Pyramid**

Produce:
1. ASCII pyramid visualization with percentages per tier
2. Tier table with: tier, percentage, count, rationale specific to THIS app
3. Justification paragraph explaining why these percentages fit this application

Rules:
- Pyramid percentages MUST sum to 100%
- Rationale must reference this specific application's architecture, not generic statements like "unit tests are fast"
- Target ranges: Unit 60-70%, Integration 10-15%, API 20-25%, E2E 3-5% -- adjust based on where the app's logic lives
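One way to turn the target ranges into concrete per-tier counts that always sum to the total. The 65/11/21/3 split is an illustrative pick from within the ranges above, not a value this spec mandates:

```javascript
// Illustrative percentages inside each target range; they sum to 100.
const TIER_PERCENTS = { unit: 65, integration: 11, api: 21, e2e: 3 };

function pyramidCounts(totalTests) {
  const tiers = Object.keys(TIER_PERCENTS);
  const counts = {};
  let assigned = 0;
  for (const tier of tiers.slice(0, -1)) {
    counts[tier] = Math.round((TIER_PERCENTS[tier] / 100) * totalTests);
    assigned += counts[tier];
  }
  // Assign the remainder to the last tier so counts sum exactly to the total.
  counts[tiers[tiers.length - 1]] = totalTests - assigned;
  return counts;
}
```

Rounding each tier independently can drift off the total, which is why the last tier takes the remainder.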
</step>

<step name="produce_test_inventory">
Produce TEST_INVENTORY.md with ALL 5 required sections from templates/test-inventory.md. Test count depends on the repository's size and complexity -- follow the pyramid distribution from the QA_ANALYSIS.md testing pyramid.

**Section 1: Summary**

| Field | Description |
|-------|-------------|
| total_tests | Total across all tiers |
| unit_count + unit_percent | Count and percentage (target 60-70%) |
| integration_count + integration_percent | Count and percentage (target 10-15%) |
| api_count + api_percent | Count and percentage (target 20-25%) |
| e2e_count + e2e_percent | Count and percentage (target 3-5%) |
| p0_count | Number of P0 tests |
| p1_count | Number of P1 tests |
| p2_count | Number of P2 tests |
| coverage_narrative | 2-3 sentences: what this inventory covers and any known gaps |

Summary counts MUST match the actual test case counts in each section below.
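The summary/section consistency rule lends itself to a mechanical check. A sketch, assuming a hypothetical parsed `inventory` object with a `summary` map and per-tier test case arrays:

```javascript
// Declared Summary tier counts must equal the number of test cases
// actually listed in each section.
function verifySummaryCounts(inventory) {
  const mismatches = [];
  for (const tier of ['unit', 'integration', 'api', 'e2e']) {
    const declared = inventory.summary[`${tier}_count`];
    const actual = inventory.sections[tier].length;
    if (declared !== actual) mismatches.push({ tier, declared, actual });
  }
  return mismatches; // empty array means the Summary is consistent
}
```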

**Section 2: Unit Tests (target 60-70%)**

For EVERY unit test case, ALL 7 fields are MANDATORY:

| Field | Format | Rule |
|-------|--------|------|
| test_id | UT-MODULE-NNN | Unique across entire document |
| target | file_path:function_name | Specific function, not just a file |
| what_to_validate | One sentence | Clear behavior description |
| concrete_inputs | Actual values | NOT "valid data" or "correct input" -- use real values like `{email: 'test@example.com', password: 'SecureP@ss123!'}` |
| mocks_needed | List or "None (pure function)" | Dependencies to mock |
| expected_outcome | Exact value/error/state | NOT "returns correct data" -- use exact values like `Returns 239.47` or `Throws InvalidTransitionError with message 'Cannot transition from delivered to pending'` |
| priority | P0, P1, or P2 | P0 = blocks release, P1 = should fix, P2 = nice to have |

Group unit tests by module with clear section headers. Include both happy-path and error cases for critical modules (auth, payments, orders) -- minimum 1 success + 1 failure per function.

Match test targets to the Top 10 Unit Test Targets from QA_ANALYSIS.md.
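The ID format and uniqueness rules above are easy to enforce mechanically. A sketch, assuming single-segment uppercase module/resource/flow names (e.g. `UT-AUTH-001`); loosen the pattern if module names contain hyphens:

```javascript
// Test ID format per the naming convention:
// UT-MODULE-NNN, INT-MODULE-NNN, API-RESOURCE-NNN, E2E-FLOW-NNN.
const ID_PATTERN = /^(UT|INT|API|E2E)-[A-Z0-9]+-\d{3}$/;

function validateTestIds(ids) {
  const problems = [];
  const seen = new Set();
  for (const id of ids) {
    if (!ID_PATTERN.test(id)) problems.push({ id, issue: 'bad format' });
    if (seen.has(id)) problems.push({ id, issue: 'duplicate' });
    seen.add(id);
  }
  return problems; // empty array means all IDs are well-formed and unique
}
```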

**Section 3: Integration/Contract Tests (target 10-15%)**

For each test case, ALL fields mandatory:
- test_id: INT-MODULE-NNN
- components_involved: Which modules interact
- what_to_validate: The interaction contract being tested
- setup_required: Database state, mock services, or seed data needed
- expected_outcome: Specific behavior when components interact correctly
- priority: P0, P1, or P2

Focus on: database interactions, service-to-service calls, cross-module state flows.

**Section 4: API Tests (target 20-25%)**

For each test case, ALL fields mandatory:
- test_id: API-RESOURCE-NNN
- method_endpoint: HTTP method + path
- request_body: Exact JSON payload, or "N/A" for GET requests
- headers: Required headers or "None"
- expected_status: Exact HTTP status code (e.g., 200, 201, 400, 401, 404)
- expected_response: Key fields in response body with types or exact values
- priority: P0, P1, or P2

Include both success and error scenarios for each resource.

**Section 5: E2E Smoke Tests (target 3-5%, 3-8 tests total)**

For each test case, ALL fields mandatory:
- test_id: E2E-FLOW-NNN
- user_journey: Step-by-step description of what the user does
- pages_involved: List of views/routes
- expected_outcome: Final state the user observes
- priority: Always P0 -- E2E tests are release-blocking by definition

---

**CRITICAL ANTI-PATTERN CHECK:**

Before finalizing TEST_INVENTORY.md, scan EVERY expected_outcome field in every section. If ANY expected outcome contains one of these vague words without a concrete value following it, REWRITE it:

| Vague Word | Problem | Fix |
|------------|---------|-----|
| "correct" | Does not specify what "correct" means | Specify the exact correct value: "Returns `239.47`" |
| "proper" | Does not specify what "proper" means | Specify what "proper" means: "Returns status 200 with body `{id: 'usr_123'}`" |
| "appropriate" | Does not specify the exact behavior | Specify the exact behavior: "Throws `ValidationError` with message 'Email is required'" |
| "works" | Does not specify the observable result | Specify the observable result: "User is redirected to `/dashboard` with session cookie set" |
| "valid" | Does not specify what makes it valid | Specify the validation criteria: "Returns `{valid: true, errors: []}`" |

Example transformations:
- BAD: "Returns correct data" -> GOOD: "Returns `{id: 'usr_123', email: 'test@example.com', role: 'customer'}`"
- BAD: "Handles error properly" -> GOOD: "Throws `PaymentFailedError` with message 'Card was declined'"
- BAD: "Returns appropriate status" -> GOOD: "Returns HTTP 401 with body `{error: 'Authentication required'}`"
- BAD: "Works correctly" -> GOOD: "Product stock decremented from 10 to 7 in database, order status is 'pending'"
- BAD: "Validates input" -> GOOD: "Returns `{valid: false, errors: ['Password must be at least 8 characters']}`"

This check is NON-NEGOTIABLE. Every expected outcome must contain a concrete value, a specific error type/message, or a measurable state change.
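A first-pass scanner for the anti-pattern check could look like the sketch below. A plain word-boundary match cannot tell whether a concrete value follows the word (e.g. `{valid: true}` is fine), so treat hits as candidates for rewrite rather than automatic failures:

```javascript
// Flag expected outcomes containing a vague word from the table above.
const VAGUE_RE = /\b(correct|proper|appropriate|works|valid)\b/i;

function findVagueOutcomes(testCases) {
  return testCases.filter((tc) => VAGUE_RE.test(tc.expected_outcome));
}
```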
</step>

<step name="produce_blueprint">
Check whether the orchestrator indicated the workflow option via the `workflow_option` parameter in the prompt context.

**If `workflow_option` is 1** (or not specified -- default to producing it):
Produce QA_REPO_BLUEPRINT.md with all 7 required sections from templates/qa-repo-blueprint.md:

1. **Project Info:** Suggested repo name (`{project}-qa-tests`), relationship (separate-repo or subdirectory), target dev repo, framework rationale specific to this dev repo's stack.

2. **Folder Structure:** Complete directory tree with a per-directory explanation. Must include: `tests/e2e/smoke/`, `tests/e2e/regression/`, `tests/api/`, `tests/unit/`, `pages/base/`, `pages/{feature}/`, `pages/components/`, `fixtures/`, `config/`, `reports/`, `.github/workflows/`.

3. **Recommended Stack:** Table with component, tool, version, and rationale tied to the dev repo's stack.

4. **Config Files:** Complete, ready-to-use config files (not snippets): test framework config, TypeScript config, `.env.example`, `.gitignore`, package.json scripts.

5. **Execution Scripts:** npm scripts table with at minimum: `test:smoke`, `test:regression`, `test:api`, `test:unit`, `test:report`, `test:ci`.

6. **CI/CD Strategy:** GitHub Actions YAML for a PR gate (smoke tests) and a nightly schedule (regression).

7. **Definition of Done:** 10-12 item checklist covering structure, passing tests, green CI, and baseline quality.

**If `workflow_option` is 2 or 3:**
Skip this step -- the QA repo already exists.
</step>

<step name="write_output">
Write all produced artifacts to the output paths specified by the orchestrator.

1. **Write QA_ANALYSIS.md** to the output path from the orchestrator prompt.
2. **Write TEST_INVENTORY.md** to the output path from the orchestrator prompt.
3. **If produced, write QA_REPO_BLUEPRINT.md** to the output path from the orchestrator prompt.

**Commit all artifacts:**

If QA_REPO_BLUEPRINT.md was produced:
```bash
node bin/qaa-tools.cjs commit "qa(analyzer): produce QA_ANALYSIS.md, TEST_INVENTORY.md, and QA_REPO_BLUEPRINT.md" --files {qa_analysis_path} {test_inventory_path} {blueprint_path}
```

If only QA_ANALYSIS.md and TEST_INVENTORY.md were produced:
```bash
node bin/qaa-tools.cjs commit "qa(analyzer): produce QA_ANALYSIS.md and TEST_INVENTORY.md" --files {qa_analysis_path} {test_inventory_path}
```

Replace `{qa_analysis_path}`, `{test_inventory_path}`, and `{blueprint_path}` with the actual output paths provided by the orchestrator.
</step>

<step name="validate_output">
Run quality gate checks against all produced artifacts before considering the task complete.

**Validate QA_ANALYSIS.md:**

1. Verify all 6 sections are present:
   - [ ] Architecture Overview with properties table, entry points table, internal layers
   - [ ] External Dependencies with risk level and justification per dependency
   - [ ] Risk Assessment with RISK-NNN IDs and evidence citing specific files
   - [ ] Top 10 Unit Test Targets ranked by composite score
   - [ ] API/Contract Test Targets grouped by resource
   - [ ] Recommended Testing Pyramid with ASCII visualization and tier table

2. Verify section quality:
   - [ ] Pyramid percentages sum to exactly 100%
   - [ ] Every risk cites a specific file or function as evidence (no generic risks)
   - [ ] Top 10 targets are ranked by the weighted business_impact/complexity/change_frequency score, not alphabetically
   - [ ] Every dependency has a risk justification specific to this app
   - [ ] Entry points table lists every route file with methods and auth requirements

**Validate TEST_INVENTORY.md:**

1. Verify all 5 sections are present:
   - [ ] Summary with counts and percentages
   - [ ] Unit Tests
   - [ ] Integration/Contract Tests
   - [ ] API Tests
   - [ ] E2E Smoke Tests

2. Verify test case quality:
   - [ ] All test IDs are unique across the entire document (no duplicates)
   - [ ] All test IDs follow the naming convention: UT-MODULE-NNN, INT-MODULE-NNN, API-RESOURCE-NNN, E2E-FLOW-NNN
   - [ ] Every unit test has all 7 mandatory fields: test_id, target, what_to_validate, concrete_inputs, mocks_needed, expected_outcome, priority
   - [ ] Summary tier counts match the actual test case counts in each section
   - [ ] Summary percentages approximately match the testing pyramid from QA_ANALYSIS.md

3. **Anti-pattern scan -- MANDATORY:**
   Scan every expected_outcome field in the entire document. If ANY contains:
   - "correct" without a concrete value following it
   - "proper" without a concrete value following it
   - "appropriate" without a concrete value following it
   - "works" without a concrete value following it
   - "valid" without a concrete value following it

   then REWRITE that expected outcome with a specific value before finalizing.

**Validate QA_REPO_BLUEPRINT.md (if produced):**

- [ ] All 7 sections present (Project Info, Folder Structure, Recommended Stack, Config Files, Execution Scripts, CI/CD Strategy, Definition of Done)
- [ ] Folder structure includes the mandatory directories
- [ ] Config files are complete (not snippets)
- [ ] npm scripts include all 6 required scripts
- [ ] CI/CD includes both a PR gate and a nightly schedule
- [ ] Definition of Done has 10+ checklist items
- [ ] No hardcoded credentials anywhere

**Handle SCAN_MANIFEST.md gaps:**

If SCAN_MANIFEST.md was incomplete (missing sections or sparse data), document the specific gaps in the QA_ANALYSIS.md Architecture Overview section under an "Assumptions and Gaps" subsection. State what was assumed due to missing data and how it affects confidence in the analysis.
</step>

</process>

<output>
The analyzer agent produces these artifacts:

**Always produced:**
- **QA_ANALYSIS.md** -- Comprehensive testability report with 6 sections (architecture, dependencies, risks, top 10 targets, API targets, pyramid). Written to the output path specified by the orchestrator.
- **TEST_INVENTORY.md** -- Complete test case inventory with 5 sections (summary, unit tests, integration tests, API tests, E2E smoke tests). Every test case has a unique ID, specific target, concrete inputs, and an explicit expected outcome with exact values. Written to the output path specified by the orchestrator.

**Conditionally produced (Option 1 workflows only):**
- **QA_REPO_BLUEPRINT.md** -- Repository blueprint with 7 sections (project info, folder structure, recommended stack, config files, execution scripts, CI/CD strategy, definition of done). Written to the output path specified by the orchestrator.

**Return to orchestrator:**
After writing artifacts, return a structured summary:

```
ANALYZER_COMPLETE:
  files_produced:
    - path: "{qa_analysis_path}"
      artifact: "QA_ANALYSIS.md"
    - path: "{test_inventory_path}"
      artifact: "TEST_INVENTORY.md"
    - path: "{blueprint_path}"          # Only if produced
      artifact: "QA_REPO_BLUEPRINT.md"  # Only if produced
  total_test_count: {N}
  pyramid_breakdown:
    unit: {count}
    integration: {count}
    api: {count}
    e2e: {count}
  risk_count:
    high: {count}
    medium: {count}
    low: {count}
  commit_hash: "{hash}"
```
</output>

<quality_gate>
Before considering this agent's work complete, ALL of the following must be verified.

**From templates/qa-analysis.md quality gate:**

- [ ] Architecture Overview has all required fields populated with specific values (not placeholders)
- [ ] Entry Points table lists every route file with methods and auth requirements
- [ ] External Dependencies table includes every production dependency with risk justification
- [ ] Every risk in Risk Assessment cites a specific file or function as evidence
- [ ] Top 10 Unit Test Targets are ranked by composite score, not alphabetically
- [ ] Every unit test target has a specific function/method name (not just a file)
- [ ] API/Contract Test Targets include request and response shapes with specific field names
- [ ] Testing Pyramid percentages sum to 100%
- [ ] Testing Pyramid rationale references this specific application's architecture
- [ ] No risk, target, or dependency uses a generic justification without evidence from the codebase

**From templates/test-inventory.md quality gate:**

- [ ] Every test case has a unique ID following the naming convention
- [ ] Every test case has an explicit expected outcome with a concrete value (not "works correctly")
- [ ] Every unit test has all 7 mandatory fields filled (ID, target, what to validate, inputs, mocks, outcome, priority)
- [ ] Every API test includes exact HTTP method, endpoint, request body, and expected status code
- [ ] Summary counts match the actual number of test cases in each section
- [ ] Summary percentages approximately match the testing pyramid (60-70% unit, 10-15% integration, 20-25% API, 3-5% E2E)
- [ ] Priority is assigned to every test case (P0, P1, or P2)
- [ ] No expected outcome contains the vague words "correct", "proper", "appropriate", "valid", or "works" without defining what they mean
- [ ] Test targets reference file paths and function names from QA_ANALYSIS.md
- [ ] Both happy-path and error cases are included for critical modules (auth, payments, orders)

**Analyzer-specific additional checks:**

- [ ] No expected outcome uses "correct", "proper", "appropriate", or "works" without a concrete value
- [ ] Pyramid percentages sum to 100%
- [ ] Test IDs are unique and follow the naming convention (UT-MODULE-NNN, INT-MODULE-NNN, API-RESOURCE-NNN, E2E-FLOW-NNN)
- [ ] Every unit test has all 7 mandatory fields (test_id, target, what_to_validate, concrete_inputs, mocks_needed, expected_outcome, priority)
- [ ] Every risk cites a specific file or function as evidence
- [ ] Summary tier counts match the actual test case counts in each section
- [ ] Assumptions section documents any gaps from an incomplete SCAN_MANIFEST.md

**From templates/qa-repo-blueprint.md quality gate (if QA_REPO_BLUEPRINT.md was produced):**

- [ ] All 7 required sections are present and filled
- [ ] Folder structure includes all mandatory directories
- [ ] Recommended stack tools are specific to the target dev repo's language and framework
- [ ] Config files are complete and ready to use (not snippets)
- [ ] Execution scripts include all 6 required scripts
- [ ] CI/CD strategy includes both a PR gate and a nightly schedule
- [ ] Definition of Done has 10+ checklist items
- [ ] No hardcoded credentials anywhere in config files
</quality_gate>

<success_criteria>
The analyzer agent has completed successfully when:

1. **QA_ANALYSIS.md** exists at the output path with all 6 required sections populated with data specific to the analyzed repository
2. **TEST_INVENTORY.md** exists at the output path with all 5 required sections, and every test case has:
   - A unique ID following the naming convention (UT-MODULE-NNN, INT-MODULE-NNN, API-RESOURCE-NNN, E2E-FLOW-NNN)
   - All mandatory fields filled (7 fields for unit tests, 6 for integration, 7 for API, 5 for E2E)
   - An explicit expected outcome with a concrete value -- no vague assertions
3. **Test case count** follows the pyramid distribution (60-70% unit, 10-15% integration, 20-25% API, 3-5% E2E)
4. **QA_REPO_BLUEPRINT.md** exists at the output path (Option 1 workflows only) with all 7 required sections
5. All artifacts are committed via `node bin/qaa-tools.cjs commit`
6. The return to the orchestrator includes: file paths, total test count, pyramid breakdown (unit/integration/api/e2e counts), and risk count (high/medium/low)
</success_criteria>