qaa-agent 1.6.2 → 1.6.3

Files changed (73)
  1. package/.claude/commands/create-test.md +164 -164
  2. package/.claude/commands/qa-audit.md +37 -37
  3. package/.claude/commands/qa-blueprint.md +54 -54
  4. package/.claude/commands/qa-fix.md +36 -36
  5. package/.claude/commands/qa-from-ticket.md +24 -24
  6. package/.claude/commands/qa-gap.md +20 -20
  7. package/.claude/commands/qa-map.md +47 -47
  8. package/.claude/commands/qa-pom.md +36 -36
  9. package/.claude/commands/qa-pr.md +23 -23
  10. package/.claude/commands/qa-pyramid.md +37 -37
  11. package/.claude/commands/qa-report.md +38 -38
  12. package/.claude/commands/qa-research.md +33 -33
  13. package/.claude/commands/qa-start.md +22 -22
  14. package/.claude/commands/qa-testid.md +19 -19
  15. package/.claude/commands/qa-validate.md +42 -42
  16. package/.claude/commands/update-test.md +58 -58
  17. package/.claude/settings.json +20 -20
  18. package/.claude/skills/qa-bug-detective/SKILL.md +122 -122
  19. package/.claude/skills/qa-learner/SKILL.md +150 -150
  20. package/.claude/skills/qa-repo-analyzer/SKILL.md +88 -88
  21. package/.claude/skills/qa-self-validator/SKILL.md +109 -109
  22. package/.claude/skills/qa-template-engine/SKILL.md +113 -113
  23. package/.claude/skills/qa-testid-injector/SKILL.md +93 -93
  24. package/.claude/skills/qa-workflow-documenter/SKILL.md +87 -87
  25. package/.mcp.json +8 -8
  26. package/CHANGELOG.md +71 -71
  27. package/CLAUDE.md +553 -553
  28. package/agents/qa-pipeline-orchestrator.md +1378 -1378
  29. package/agents/qaa-analyzer.md +524 -524
  30. package/agents/qaa-bug-detective.md +446 -446
  31. package/agents/qaa-codebase-mapper.md +935 -935
  32. package/agents/qaa-e2e-runner.md +415 -415
  33. package/agents/qaa-executor.md +651 -651
  34. package/agents/qaa-planner.md +390 -390
  35. package/agents/qaa-project-researcher.md +319 -319
  36. package/agents/qaa-scanner.md +424 -424
  37. package/agents/qaa-testid-injector.md +585 -585
  38. package/agents/qaa-validator.md +452 -452
  39. package/bin/install.cjs +198 -198
  40. package/bin/lib/commands.cjs +709 -709
  41. package/bin/lib/config.cjs +307 -307
  42. package/bin/lib/core.cjs +497 -497
  43. package/bin/lib/frontmatter.cjs +299 -299
  44. package/bin/lib/init.cjs +989 -989
  45. package/bin/lib/milestone.cjs +241 -241
  46. package/bin/lib/model-profiles.cjs +60 -60
  47. package/bin/lib/phase.cjs +911 -911
  48. package/bin/lib/roadmap.cjs +306 -306
  49. package/bin/lib/state.cjs +748 -748
  50. package/bin/lib/template.cjs +222 -222
  51. package/bin/lib/verify.cjs +842 -842
  52. package/bin/qaa-tools.cjs +607 -607
  53. package/docs/COMMANDS.md +341 -341
  54. package/docs/DEMO.md +182 -182
  55. package/docs/TESTING.md +156 -156
  56. package/package.json +41 -41
  57. package/templates/failure-classification.md +391 -391
  58. package/templates/gap-analysis.md +409 -409
  59. package/templates/pr-template.md +48 -48
  60. package/templates/qa-analysis.md +381 -381
  61. package/templates/qa-audit-report.md +465 -465
  62. package/templates/qa-repo-blueprint.md +636 -636
  63. package/templates/scan-manifest.md +312 -312
  64. package/templates/test-inventory.md +582 -582
  65. package/templates/testid-audit-report.md +354 -354
  66. package/templates/validation-report.md +243 -243
  67. package/workflows/qa-analyze.md +296 -296
  68. package/workflows/qa-from-ticket.md +536 -536
  69. package/workflows/qa-gap.md +303 -303
  70. package/workflows/qa-pr.md +389 -389
  71. package/workflows/qa-start.md +1168 -1168
  72. package/workflows/qa-testid.md +356 -356
  73. package/workflows/qa-validate.md +295 -295
@@ -1,1378 +1,1378 @@
1
- <purpose>
2
- Single orchestrator for the QA automation pipeline. Coordinates all 7 agent types (scanner, analyzer, planner, executor, validator, bug-detective, testid-injector) across 3 workflow options. Owns all pipeline state transitions -- agents never update state directly. The orchestrator sets stage status to 'running' before spawning an agent and 'complete' or 'failed' after the agent returns.
3
-
4
- Invoked by the `/qa-start` slash command (Phase 6) or directly via Task() with this file as execution_context. Accepts 0-2 repo paths: with 0 paths, cwd is used as the dev repo (Option 1); 1 path is dev-only (Option 1); 2 paths trigger maturity scoring to determine Option 2 or 3.
5
-
6
- **Pipeline stages in order:**
7
- ```
8
- scan -> codebase-map -> analyze -> [testid-inject if frontend] -> plan -> generate -> validate -> [e2e-runner if E2E tests] -> [bug-detective if failures] -> deliver
9
- ```
10
-
11
- **Workflow options:**
12
- - Option 1: Dev-only repo -- full pipeline from scratch
13
- - Option 2: Dev + immature QA repo -- gap-fill and standardize
14
- - Option 3: Dev + mature QA repo -- surgical additions only
15
- </purpose>
16
-
17
- <required_reading>
18
- Read these files BEFORE executing any pipeline stage. Do NOT skip.
19
-
20
- - **CLAUDE.md** -- Agent pipeline stages, module boundaries, quality gates, stage transitions, auto-advance rules, agent coordination, data-testid convention. Read the full file.
21
- - **.planning/STATE.md** -- Current pipeline state. Check scan_status, analyze_status, generate_status, validate_status, deliver_status fields.
22
- - **.planning/config.json** -- Workflow configuration: auto_advance flag, parallelization flag, mode, commit_docs.
23
- </required_reading>
24
-
25
- <process>
26
-
27
- <step name="initialize">
28
- ## Step 1: Initialize Pipeline
29
-
30
- Call `qaa-tools.cjs init qa-start` to bootstrap the full workflow context.
31
-
32
- ```bash
33
- INIT_JSON=$(node bin/qaa-tools.cjs init qa-start)
34
- ```
35
-
36
- Parse the JSON to extract all required fields:
37
-
38
- ```
39
- option -- 1, 2, or 3 (workflow routing)
40
- dev_repo_path -- path to the developer repository
41
- qa_repo_path -- path to existing QA repository (null for Option 1)
42
- maturity_score -- 0-100 QA repo quality score (null for Option 1)
43
- maturity_note -- descriptive note about maturity assessment (null for Option 1)
44
- output_dir -- ".qa-output" (where agents write artifacts)
45
- date -- "YYYY-MM-DD" for branch naming and timestamps
46
-
47
- scanner_model -- model for scanner agent
48
- analyzer_model -- model for analyzer agent
49
- planner_model -- model for planner agent
50
- executor_model -- model for executor agent
51
- validator_model -- model for validator agent
52
- detective_model -- model for bug-detective agent
53
- injector_model -- model for testid-injector agent
54
-
55
- auto_advance -- persistent config flag (boolean)
56
- auto_chain_active -- ephemeral chain flag (boolean)
57
- parallelization -- parallelization config value
58
- commit_docs -- whether to commit documentation artifacts
59
- ```
60
-
61
- **Determine auto-advance mode:**
62
-
63
- ```bash
64
- IS_AUTO=false
65
-
66
- # Check persistent config flag and ephemeral chain flag
67
- if auto_advance is true OR auto_chain_active is true; then
68
- IS_AUTO=true
69
- node bin/qaa-tools.cjs config-set workflow._auto_chain_active true
70
- fi
71
-
72
- # Check if --auto was passed as argument to orchestrator invocation
73
- if --auto flag was passed; then
74
- IS_AUTO=true
75
- node bin/qaa-tools.cjs config-set workflow._auto_chain_active true
76
- fi
77
- ```
78
-
79
- **Safety: Clear stale auto-chain flag** -- if NOT in auto mode, clear the ephemeral flag to prevent a previous interrupted `--auto` run from causing unexpected auto-advance:
80
-
81
- ```bash
82
- if IS_AUTO is false:
83
- node bin/qaa-tools.cjs config-set workflow._auto_chain_active false
84
- ```
85
-
86
- **Print initialization banner:**
87
-
88
- ```
89
- === QA Pipeline Orchestrator ===
90
- Option: {option} ({description})
91
- Dev Repo: {dev_repo_path}
92
- QA Repo: {qa_repo_path or 'N/A'}
93
- Maturity Score: {maturity_score or 'N/A'}
94
- Auto-Advance: {IS_AUTO}
95
- Date: {date}
96
- ================================
97
- ```
98
-
99
- Where `{description}` is:
100
- - Option 1: "Dev-Only -- Full Pipeline"
101
- - Option 2: "Dev + Immature QA -- Gap-Fill"
102
- - Option 3: "Dev + Mature QA -- Surgical"
103
- </step>
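The auto-advance rules in Step 1 are written as pseudocode, so a small sketch may help pin down the intended logic. The helper below is hypothetical (it is not part of `qaa-tools.cjs`); it assumes the init JSON fields `auto_advance` and `auto_chain_active` are booleans and that `--auto` arrives in an argument list.

```javascript
// Hypothetical helper, not part of qaa-tools.cjs.
// `init` carries auto_advance and auto_chain_active from the init JSON;
// `argv` is the orchestrator's argument list (assumed shape).
function resolveAutoMode(init, argv = []) {
  const isAuto =
    init.auto_advance === true ||
    init.auto_chain_active === true ||
    argv.includes("--auto");
  // Value to write to workflow._auto_chain_active:
  // true in auto mode, false otherwise (the "clear stale flag" branch).
  return { isAuto, autoChainActive: isAuto };
}

console.log(resolveAutoMode({ auto_advance: false, auto_chain_active: false }, ["--auto"]));
```

The returned `autoChainActive` value corresponds to the two `config-set workflow._auto_chain_active` calls in the step.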
104
-
105
- <step name="route_by_option">
106
- ## Step 2: Route by Option
107
-
108
- Based on `option` value from init, select the stage sequence. Each option shares the same core pipeline but differs in how agents are parameterized and what artifacts they produce.
109
-
110
- **Option 1 stages:**
111
- ```
112
- scan(dev) -> codebase-map -> analyze(full) -> [testid-inject if frontend] -> plan -> generate -> validate -> [bug-detective if failures] -> deliver
113
- ```
114
- - Scanner: scan DEV repo only
115
- - Codebase Map: deep-scan codebase for testability, risk, patterns, existing tests
116
- - Analyzer: mode='full' (produces QA_ANALYSIS.md + TEST_INVENTORY.md + QA_REPO_BLUEPRINT.md)
117
- - Planner: reads TEST_INVENTORY.md + QA_ANALYSIS.md + codebase map documents
118
- - Executor: generates all planned test files using codebase map for context
119
- - All stages run against DEV repo artifacts
120
-
121
- **Option 2 stages:**
122
- ```
123
- scan(both) -> codebase-map -> analyze(gap) -> [testid-inject if frontend] -> plan(gap) -> generate(gap) -> validate -> [bug-detective if failures] -> deliver
124
- ```
125
- - Scanner: scan BOTH dev_repo_path and qa_repo_path
126
- - Codebase Map: deep-scan dev codebase for testability, risk, patterns, existing tests
127
- - Analyzer: mode='gap' (produces GAP_ANALYSIS.md)
128
- - Planner: reads GAP_ANALYSIS.md + codebase map documents (fix broken first, then add missing, then standardize)
129
- - Executor: generates fixed test files + new test files + standardized files
130
- - All stages aware of existing QA repo structure
131
-
132
- **Option 3 stages:**
133
- ```
134
- scan(both) -> codebase-map -> analyze(gap) -> [testid-inject if frontend] -> plan(gap) -> generate(skip-existing) -> validate -> [bug-detective if failures] -> deliver
135
- ```
136
- - Scanner: scan BOTH dev_repo_path and qa_repo_path
137
- - Codebase Map: deep-scan dev codebase for testability, risk, patterns, existing tests
138
- - Analyzer: mode='gap' (produces GAP_ANALYSIS.md with thin areas only)
139
- - Planner: reads GAP_ANALYSIS.md + codebase map documents (missing tests only)
140
- - Executor: passes `skip_existing_test_ids: true` so it checks existing test files by test ID before generating -- skips tests that already exist
141
- - Only new test files are generated; existing working tests are left untouched
142
-
143
- **Shared stages across all options:**
144
- - TestID injection: conditional on `has_frontend` from scanner return
145
- - Validation: always runs on generated files
146
- - Bug detective: conditional on test failures
147
- - Deliver: always runs (branch creation, per-stage commits, push, draft PR via gh CLI)
148
- </step>
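The three per-option stage listings in Step 2 differ only in a few stage variants, which can be expressed as a lookup table. This is a minimal sketch under that reading; `stageSequence` and its option names are illustrative, and it follows the per-option listings, which do not include the e2e-runner stage.

```javascript
// Stage variants per workflow option, taken from the Step 2 listings.
const VARIANTS = {
  1: { scan: "dev", analyze: "full", plan: null, generate: null },
  2: { scan: "both", analyze: "gap", plan: "gap", generate: "gap" },
  3: { scan: "both", analyze: "gap", plan: "gap", generate: "skip-existing" },
};

// Render "stage(variant)" or just "stage" when there is no variant.
function label(stage, variant) {
  return variant ? `${stage}(${variant})` : stage;
}

// Hypothetical helper: builds the ordered stage list for one run.
function stageSequence(option, { hasFrontend = false, hasFailures = false } = {}) {
  const v = VARIANTS[option];
  if (!v) throw new Error(`Unknown workflow option: ${option}`);
  const stages = [label("scan", v.scan), "codebase-map", label("analyze", v.analyze)];
  if (hasFrontend) stages.push("testid-inject"); // conditional on scanner's has_frontend
  stages.push(label("plan", v.plan), label("generate", v.generate), "validate");
  if (hasFailures) stages.push("bug-detective"); // conditional on test failures
  stages.push("deliver");
  return stages;
}

console.log(stageSequence(3, { hasFrontend: true }).join(" -> "));
```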
149
-
150
- <step name="execute_scan">
151
- ## Step 3: Execute Scan Stage
152
-
153
- **State update -- mark scan as running:**
154
- ```bash
155
- node bin/qaa-tools.cjs state patch --"Scan Status" running --"Status" "Scanning repository"
156
- ```
157
-
158
- **Print stage banner:**
159
- ```
160
- +------------------------------------------+
161
- | STAGE 1: Scanner |
162
- | Status: Running... |
163
- +------------------------------------------+
164
- ```
165
-
166
- **Spawn scanner agent via Task():**
167
-
168
- For **Option 1** -- scan DEV repo only:
169
- ```
170
- Task(
171
- prompt="
172
- <objective>Scan repository and produce SCAN_MANIFEST.md</objective>
173
- <execution_context>@agents/qaa-scanner.md</execution_context>
174
- <files_to_read>
175
- - CLAUDE.md
176
- </files_to_read>
177
- <parameters>
178
- dev_repo_path: {dev_repo_path}
179
- qa_repo_path: null
180
- output_path: {output_dir}/SCAN_MANIFEST.md
181
- </parameters>
182
- "
183
- )
184
- ```
185
-
186
- For **Options 2 and 3** -- scan BOTH repos:
187
- ```
188
- Task(
189
- prompt="
190
- <objective>Scan both developer and QA repositories and produce SCAN_MANIFEST.md</objective>
191
- <execution_context>@agents/qaa-scanner.md</execution_context>
192
- <files_to_read>
193
- - CLAUDE.md
194
- </files_to_read>
195
- <parameters>
196
- dev_repo_path: {dev_repo_path}
197
- qa_repo_path: {qa_repo_path}
198
- output_path: {output_dir}/SCAN_MANIFEST.md
199
- </parameters>
200
- "
201
- )
202
- ```
203
-
204
- **Parse scanner return:**
205
-
206
- Expected return structure:
207
- ```
208
- SCANNER_COMPLETE:
209
- file_path: ".qa-output/SCAN_MANIFEST.md"
210
- decision: PROCEED | STOP
211
- has_frontend: true | false
212
- detection_confidence: HIGH | MEDIUM | LOW
213
- ```
214
-
215
- **Handle decision field:**
216
-
217
- - If `decision` is `STOP`:
218
- ```bash
219
- node bin/qaa-tools.cjs state patch --"Scan Status" failed --"Status" "Pipeline stopped: Scanner returned STOP"
220
- ```
221
- Print failure banner and STOP PIPELINE ENTIRELY. Do NOT proceed to any further stage.
222
-
223
- - If `decision` is `PROCEED`:
224
- ```bash
225
- node bin/qaa-tools.cjs state patch --"Scan Status" complete
226
- ```
227
- Capture `has_frontend` for testid-injector conditional.
228
- Capture `detection_confidence` for checkpoint handling.
229
-
230
- **Handle scanner checkpoint -- framework detection uncertain:**
231
-
232
- If `detection_confidence` is `LOW`:
233
- - If `IS_AUTO` is true: Auto-approve with most likely framework (SAFE checkpoint). Log: "Auto-approved: Scanner framework detection (LOW confidence, selected most likely framework)". Continue pipeline.
234
- - If `IS_AUTO` is false: Present the detection details to user. Wait for confirmation. On user response, spawn fresh continuation agent with user's framework choice.
235
- </step>
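The decision and checkpoint handling for the scanner return can be summarized as a pure function. The sketch below assumes the `SCANNER_COMPLETE` block has already been parsed into an object with the field names shown in Step 3; the function name and the `next` values are hypothetical.

```javascript
// Hypothetical helper: maps a parsed scanner return to the orchestrator's next move.
function handleScannerReturn(ret, isAuto) {
  if (ret.decision === "STOP") {
    // Scan Status becomes "failed" and the pipeline stops entirely.
    return { scanStatus: "failed", next: "stop" };
  }
  if (ret.decision !== "PROCEED") {
    throw new Error(`Unexpected scanner decision: ${ret.decision}`);
  }
  // LOW confidence is a SAFE checkpoint: auto mode approves it, manual mode asks the user.
  const needsUser = ret.detection_confidence === "LOW" && !isAuto;
  return {
    scanStatus: "complete",
    next: needsUser ? "ask-user" : "continue",
    hasFrontend: ret.has_frontend === true, // used later by the testid-injector condition
  };
}

console.log(handleScannerReturn(
  { decision: "PROCEED", has_frontend: true, detection_confidence: "LOW" },
  false
));
```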
236
-
237
- <step name="execute_codebase_map">
238
- ## Step 4: Execute Codebase Map Stage
239
-
240
- Deep-scan the codebase to produce QA-oriented documents that all downstream agents consume. This gives the pipeline full knowledge of the codebase structure, testability, risk areas, code patterns, and existing test coverage before any analysis or generation happens.
241
-
242
- **State update -- mark codebase-map as running:**
243
- ```bash
244
- node bin/qaa-tools.cjs state patch --"Map Status" running --"Status" "Mapping codebase"
245
- ```
246
-
247
- **Print stage banner:**
248
- ```
249
- +------------------------------------------+
250
- | STAGE 2: Codebase Map |
251
- | Status: Running... |
252
- +------------------------------------------+
253
- ```
254
-
255
- **Spawn 4 mapper agents in parallel (one per focus area):**
256
-
257
- ```
258
- For each focus in [testability, risk, patterns, existing-tests]:
259
-
260
- Task(
261
- prompt="
262
- <objective>Analyze codebase for QA purposes. Focus area: {focus}. Write documents to {output_dir}/codebase/.</objective>
263
- <execution_context>@agents/qaa-codebase-mapper.md</execution_context>
264
- <files_to_read>
265
- - CLAUDE.md
266
- - {output_dir}/SCAN_MANIFEST.md
267
- </files_to_read>
268
- <parameters>
269
- focus: {focus}
270
- dev_repo_path: {dev_repo_path}
271
- output_dir: {output_dir}/codebase
272
- </parameters>
273
- "
274
- )
275
- ```
276
-
277
- All 4 agents can run simultaneously -- they read the codebase but write to separate files.
278
-
279
- **Parse mapper returns and verify outputs exist:**
280
-
281
- Expected files after all 4 complete:
282
- ```
283
- {output_dir}/codebase/TESTABILITY.md
284
- {output_dir}/codebase/TEST_SURFACE.md
285
- {output_dir}/codebase/RISK_MAP.md
286
- {output_dir}/codebase/CRITICAL_PATHS.md
287
- {output_dir}/codebase/CODE_PATTERNS.md
288
- {output_dir}/codebase/API_CONTRACTS.md
289
- {output_dir}/codebase/TEST_ASSESSMENT.md
290
- {output_dir}/codebase/COVERAGE_GAPS.md
291
- ```
292
-
293
- Verify that at least 4 of the 8 files exist (each focus area produces 2 files). If any focus area produced 0 files, log a warning but continue -- the downstream agents treat these as optional reads.
294
-
295
- **State update -- mark codebase-map as complete:**
296
- ```bash
297
- node bin/qaa-tools.cjs state patch --"Map Status" complete --"Status" "Codebase mapped"
298
- ```
299
-
300
- Print: "Codebase map complete. {N} documents produced in {output_dir}/codebase/."
301
-
302
- **Set codebase_map_dir for downstream stages:**
303
- ```
304
- codebase_map_dir = "{output_dir}/codebase"
305
- ```
306
- </step>
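The output check in Step 4 can be sketched as follows. The grouping of files by focus area is an assumption inferred from the order of the expected-files list (two files per focus); the source does not state the pairing explicitly. The function takes a list of file names that exist, so no filesystem access is needed here.

```javascript
// ASSUMPTION: file-to-focus pairing inferred from the order of the expected-files list.
const FILES_BY_FOCUS = {
  testability: ["TESTABILITY.md", "TEST_SURFACE.md"],
  risk: ["RISK_MAP.md", "CRITICAL_PATHS.md"],
  patterns: ["CODE_PATTERNS.md", "API_CONTRACTS.md"],
  "existing-tests": ["TEST_ASSESSMENT.md", "COVERAGE_GAPS.md"],
};

// Hypothetical helper: counts produced documents and collects per-focus warnings.
function checkMapOutputs(existingFiles) {
  const present = new Set(existingFiles);
  const warnings = [];
  let count = 0;
  for (const [focus, files] of Object.entries(FILES_BY_FOCUS)) {
    const found = files.filter((f) => present.has(f)).length;
    count += found;
    if (found === 0) warnings.push(`No documents produced for focus area: ${focus}`);
  }
  // At least 4 of the 8 files must exist; warnings never stop the pipeline.
  return { count, ok: count >= 4, warnings };
}

console.log(checkMapOutputs(["TESTABILITY.md", "TEST_SURFACE.md", "RISK_MAP.md", "CODE_PATTERNS.md"]));
```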
307
-
308
- <step name="execute_analyze">
309
- ## Step 5: Execute Analyze Stage
310
-
311
- **State update -- mark analyze as running:**
312
- ```bash
313
- node bin/qaa-tools.cjs state patch --"Analyze Status" running --"Status" "Analyzing repository"
314
- ```
315
-
316
- **Print stage banner:**
317
- ```
318
- +------------------------------------------+
319
- | STAGE 2: Analyzer |
320
- | Status: Running... |
321
- +------------------------------------------+
322
- ```
323
-
324
- **Determine analyzer mode based on option:**
325
- - Option 1: `mode = 'full'` (produces QA_ANALYSIS.md + TEST_INVENTORY.md + QA_REPO_BLUEPRINT.md)
326
- - Options 2 and 3: `mode = 'gap'` (produces GAP_ANALYSIS.md)
327
-
328
- **Spawn analyzer agent via Task():**
329
- ```
330
- Task(
331
- prompt="
332
- <objective>Analyze scanned repository and produce analysis artifacts</objective>
333
- <execution_context>@agents/qaa-analyzer.md</execution_context>
334
- <files_to_read>
335
- - {output_dir}/SCAN_MANIFEST.md
336
- - CLAUDE.md
337
- - {codebase_map_dir}/RISK_MAP.md (if exists)
338
- - {codebase_map_dir}/CRITICAL_PATHS.md (if exists)
339
- - {codebase_map_dir}/TEST_ASSESSMENT.md (if exists)
340
- - {codebase_map_dir}/COVERAGE_GAPS.md (if exists)
341
- </files_to_read>
342
- <parameters>
343
- mode: {mode}
344
- workflow_option: {option}
345
- dev_repo_path: {dev_repo_path}
346
- qa_repo_path: {qa_repo_path or null}
347
- output_path: {output_dir}/
348
- codebase_map_dir: {codebase_map_dir}
349
- </parameters>
350
- "
351
- )
352
- ```
353
-
354
- **Parse analyzer return:**
355
-
356
- Expected return structure:
357
- ```
358
- ANALYZER_COMPLETE:
359
- files_produced: [...]
360
- total_test_count: N
361
- pyramid_breakdown: {unit: N, integration: N, api: N, e2e: N}
362
- risk_count: {high: N, medium: N, low: N}
363
- commit_hash: "..."
364
- ```
365
-
366
- Capture `files_produced`, `total_test_count`, `pyramid_breakdown` for downstream stages.
367
-
368
- **Handle analyzer checkpoint -- assumptions review:**
369
-
370
- If the analyzer returns a checkpoint with assumptions:
371
- - If `IS_AUTO` is true: Auto-approve all assumptions (SAFE checkpoint). Log: "Auto-approved: Analyzer assumptions". Continue pipeline.
372
- - If `IS_AUTO` is false: Present assumptions to user for review. Wait for confirmation or corrections. On user response, spawn fresh continuation agent incorporating any corrections.
373
-
374
- **State update -- mark analyze as complete:**
375
- ```bash
376
- node bin/qaa-tools.cjs state patch --"Analyze Status" complete
377
- ```
378
-
379
- Print completion message: "Analysis complete. {total_test_count} test cases identified. Pyramid: {pyramid_breakdown}."
380
- </step>
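As an illustration of the mode selection and completion message in the analyze step, here is a short sketch. Both function names are hypothetical, and the way `pyramid_breakdown` is rendered inside the message is an assumption, since the source only shows the `{pyramid_breakdown}` placeholder.

```javascript
// Option 1 runs a full analysis; Options 2 and 3 run a gap analysis.
function analyzerMode(option) {
  if (option === 1) return "full";
  if (option === 2 || option === 3) return "gap";
  throw new Error(`Unknown workflow option: ${option}`);
}

// Hypothetical formatter for the completion line.
// ASSUMPTION: the pyramid is rendered as "unit N, integration N, api N, e2e N".
function analysisMessage(ret) {
  const p = ret.pyramid_breakdown;
  const pyramid = `unit ${p.unit}, integration ${p.integration}, api ${p.api}, e2e ${p.e2e}`;
  return `Analysis complete. ${ret.total_test_count} test cases identified. Pyramid: ${pyramid}.`;
}

console.log(analyzerMode(2));
console.log(analysisMessage({
  total_test_count: 12,
  pyramid_breakdown: { unit: 6, integration: 3, api: 2, e2e: 1 },
}));
```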
381
-
382
- <step name="execute_testid_inject">
383
- ## Step 5: Execute TestID Injection Stage (Conditional)
384
-
385
- **Condition:** Only execute if `has_frontend` is `true` from scanner return (Step 3).
386
-
387
- **If `has_frontend` is false:**
388
- Print: "Skipping TestID injection (no frontend detected)." and proceed directly to Step 6 (Plan).
389
-
390
- **If `has_frontend` is true:**
391
-
392
- **State update:**
393
- ```bash
394
- node bin/qaa-tools.cjs state patch --"Status" "Injecting test IDs into frontend components"
395
- ```
396
-
397
- **Print stage banner:**
398
- ```
399
- +------------------------------------------+
400
- | STAGE 3: TestID Injector |
401
- | Status: Running... |
402
- +------------------------------------------+
403
- ```
404
-
405
- **Spawn testid-injector agent via Task():**
406
- ```
407
- Task(
408
- prompt="
409
- <objective>Audit and inject data-testid attributes into frontend components</objective>
410
- <execution_context>@agents/qaa-testid-injector.md</execution_context>
411
- <files_to_read>
412
- - {output_dir}/SCAN_MANIFEST.md
413
- - CLAUDE.md
414
- </files_to_read>
415
- <parameters>
416
- dev_repo_path: {dev_repo_path}
417
- output_path: {output_dir}/TESTID_AUDIT_REPORT.md
418
- </parameters>
419
- "
420
- )
421
- ```
422
-
423
- **Parse return:**
424
-
425
- Check for `INJECTOR_COMPLETE` vs `INJECTOR_SKIPPED`:
426
-
427
- If `INJECTOR_COMPLETE`:
428
- ```
429
- INJECTOR_COMPLETE:
430
- report_path: "..."
431
- coverage_before: N%
432
- coverage_after: N%
433
- elements_injected: N
434
- ...
435
- ```
436
- Log: "TestID injection complete. Coverage: {coverage_before}% -> {coverage_after}%. {elements_injected} elements injected."
437
-
438
- If `INJECTOR_SKIPPED`:
439
- ```
440
- INJECTOR_SKIPPED:
441
- reason: "..."
442
- action: "..."
443
- ```
444
- Log the reason and continue pipeline.
445
-
446
- **Handle injector checkpoint -- audit review:**
447
- - If `IS_AUTO` is true: Auto-approve P0-only injection (SAFE checkpoint). Log: "Auto-approved: TestID injection (P0 elements only)". Continue pipeline.
448
- - If `IS_AUTO` is false: Present audit report to user. Wait for approval, element selection, or rejection. On user response, spawn fresh continuation agent with user's approved elements.
449
- </step>
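The three outcomes of the TestID stage (not run, completed, skipped by the agent) each produce one log line. The sketch below assumes the agent return has been parsed into an object with a hypothetical `kind` field and numeric coverage values; the wording of the skipped line is also an assumption, since the source only says to log the reason.

```javascript
// Hypothetical helper: picks the log line for the TestID injection stage.
function injectorLogLine(hasFrontend, ret) {
  if (!hasFrontend) {
    return "Skipping TestID injection (no frontend detected).";
  }
  if (ret.kind === "INJECTOR_SKIPPED") {
    // ASSUMPTION: exact wording is not given in the source.
    return `TestID injection skipped: ${ret.reason}`;
  }
  return (
    `TestID injection complete. Coverage: ${ret.coverage_before}% -> ` +
    `${ret.coverage_after}%. ${ret.elements_injected} elements injected.`
  );
}

console.log(injectorLogLine(true, {
  kind: "INJECTOR_COMPLETE",
  coverage_before: 20,
  coverage_after: 85,
  elements_injected: 42,
}));
```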
450
-
451
- <step name="execute_plan">
452
- ## Step 6: Execute Plan Stage
453
-
454
- **State update -- mark generation as running (planning is part of generate):**
455
- ```bash
456
- node bin/qaa-tools.cjs state patch --"Generate Status" running --"Status" "Planning test generation"
457
- ```
458
-
459
- **Print stage banner:**
460
- ```
461
- +------------------------------------------+
462
- | STAGE 4: Planner |
463
- | Status: Running... |
464
- +------------------------------------------+
465
- ```
466
-
467
- **Determine planner input based on option:**
468
- - Option 1: Input from `{output_dir}/TEST_INVENTORY.md` + `{output_dir}/QA_ANALYSIS.md`
469
- - Options 2 and 3: Input from `{output_dir}/GAP_ANALYSIS.md`
470
-
471
- **Spawn planner agent via Task():**
472
- ```
473
- Task(
474
- prompt="
475
- <objective>Create test generation plan with task breakdown and dependencies</objective>
476
- <execution_context>@agents/qaa-planner.md</execution_context>
477
- <files_to_read>
478
- - {input files based on option -- see above}
479
- - CLAUDE.md
480
- - {codebase_map_dir}/TESTABILITY.md (if exists)
481
- - {codebase_map_dir}/TEST_SURFACE.md (if exists)
482
- - {codebase_map_dir}/CRITICAL_PATHS.md (if exists)
483
- - {codebase_map_dir}/COVERAGE_GAPS.md (if exists)
484
- </files_to_read>
485
- <parameters>
486
- workflow_option: {option}
487
- output_path: {output_dir}/GENERATION_PLAN.md
488
- codebase_map_dir: {codebase_map_dir}
489
- </parameters>
490
- "
491
- )
492
- ```
493
-
494
- **Parse planner return:**
495
-
496
- Expected return structure:
497
- ```
498
- PLANNER_COMPLETE:
499
- file_path: "..."
500
- total_tasks: N
501
- total_files: N
502
- feature_count: N
503
- dependency_depth: N
504
- test_case_count: N
505
- commit_hash: "..."
506
- ```
507
-
508
- Capture `total_tasks`, `total_files`, `feature_count` for executor stage and pipeline summary.
509
-
510
- Print: "Plan complete. {total_tasks} tasks, {total_files} files planned across {feature_count} features."
511
- </step>
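The planner's `files_to_read` list depends on the workflow option. A minimal sketch of that selection, assuming a hypothetical helper that separates required inputs from the "(if exists)" codebase map documents:

```javascript
// Hypothetical helper: builds the planner's input file lists for one option.
function plannerFilesToRead(option, outputDir, codebaseMapDir) {
  if (![1, 2, 3].includes(option)) throw new Error(`Unknown workflow option: ${option}`);
  // Option 1 plans from the inventory and analysis; Options 2 and 3 plan from the gap analysis.
  const primary = option === 1 ? ["TEST_INVENTORY.md", "QA_ANALYSIS.md"] : ["GAP_ANALYSIS.md"];
  const required = [...primary.map((f) => `${outputDir}/${f}`), "CLAUDE.md"];
  // Codebase map documents are optional reads ("if exists" in the prompt).
  const optional = ["TESTABILITY.md", "TEST_SURFACE.md", "CRITICAL_PATHS.md", "COVERAGE_GAPS.md"]
    .map((f) => `${codebaseMapDir}/${f}`);
  return { required, optional };
}

console.log(plannerFilesToRead(2, ".qa-output", ".qa-output/codebase"));
```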
512
-
513
- <step name="execute_generate">
514
- ## Step 7: Execute Generate Stage (FLOW-04 -- Wave-based Parallel Execution)
515
-
516
- State update continues from planning (already set to `running` in Step 6).
517
-
518
- **Print stage banner:**
519
- ```
520
- +------------------------------------------+
521
- | STAGE 5: Executor |
522
- | Generating {total_files} test files |
523
- | Status: Running... |
524
- +------------------------------------------+
525
- ```
526
-
527
- **FLOW-04: Wave-based parallel execution:**
528
-
529
- Check if planner created multiple independent feature groups. If `feature_count > 1` AND `parallelization` config allows parallel execution:
530
-
531
- **Parallel execution (when feature_count > 1 and parallelization enabled):**
532
-
533
- For each independent feature group from the generation plan, spawn a separate executor agent:
534
- ```
535
- Task(
536
- prompt="
537
- <objective>Generate test files for {feature} feature</objective>
538
- <execution_context>@agents/qaa-executor.md</execution_context>
539
- <files_to_read>
540
- - {output_dir}/GENERATION_PLAN.md
541
- - {output_dir}/TEST_INVENTORY.md (Option 1) or {output_dir}/GAP_ANALYSIS.md (Options 2/3)
542
- - CLAUDE.md
543
- - {codebase_map_dir}/CODE_PATTERNS.md (if exists)
544
- - {codebase_map_dir}/API_CONTRACTS.md (if exists)
545
- - {codebase_map_dir}/TEST_SURFACE.md (if exists)
546
- </files_to_read>
547
- <parameters>
548
- workflow_option: {option}
549
- feature_group: {feature}
550
- dev_repo_path: {dev_repo_path}
551
- qa_repo_path: {qa_repo_path or null}
552
- output_path: {output_dir}/
553
- codebase_map_dir: {codebase_map_dir}
554
- </parameters>
555
- "
556
- )
557
- ```
558
-
559
- Multiple Task() calls can be issued simultaneously for independent feature groups. Each executor handles one feature group and commits its files independently.
560
-
561
- **Sequential execution (when feature_count == 1 or parallelization disabled):**
562
-
563
- Spawn a single executor agent covering all tasks:
564
- ```
565
- Task(
566
- prompt="
567
- <objective>Generate all test files from generation plan</objective>
568
- <execution_context>@agents/qaa-executor.md</execution_context>
569
- <files_to_read>
570
- - {output_dir}/GENERATION_PLAN.md
571
- - {output_dir}/TEST_INVENTORY.md (Option 1) or {output_dir}/GAP_ANALYSIS.md (Options 2/3)
572
- - CLAUDE.md
573
- - {codebase_map_dir}/CODE_PATTERNS.md (if exists)
574
- - {codebase_map_dir}/API_CONTRACTS.md (if exists)
575
- - {codebase_map_dir}/TEST_SURFACE.md (if exists)
576
- </files_to_read>
577
- <parameters>
578
- workflow_option: {option}
579
- dev_repo_path: {dev_repo_path}
580
- qa_repo_path: {qa_repo_path or null}
581
- output_path: {output_dir}/
582
- codebase_map_dir: {codebase_map_dir}
583
- </parameters>
584
- "
585
- )
586
- ```
587
-
588
- **Option 3 specific -- skip existing tests:**
589
-
590
- For Option 3, pass `skip_existing_test_ids: true` to the executor so it checks existing test files by test ID before generating. If a test ID already exists in the QA repo, skip generating that test case.
591
-
592
- ```
593
- <parameters>
594
- workflow_option: 3
595
- skip_existing_test_ids: true
596
- dev_repo_path: {dev_repo_path}
597
- qa_repo_path: {qa_repo_path}
598
- output_path: {output_dir}/
599
- </parameters>
600
- ```
601
-
602
- **Parse executor return:**
603
-
604
- Expected return structure:
605
- ```
606
- EXECUTOR_COMPLETE:
607
- files_created: [{path, type}, ...]
608
- total_files: N
609
- commit_count: N
610
- features_covered: [...]
611
- test_case_count: N
612
- ```
613
-
614
- Capture `files_created`, `total_files`, `commit_count` for validation stage and pipeline summary.
615
-
616
- **State update -- mark generate as complete:**
617
- ```bash
618
- node bin/qaa-tools.cjs state patch --"Generate Status" complete --"Status" "Test generation complete"
619
- ```
620
-
621
- Print: "Generation complete. {total_files} files created across {features_covered.length} features. {commit_count} commits."
622
- </step>
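The choice between parallel and sequential execution in Step 7, together with the Option 3 `skip_existing_test_ids` parameter, can be sketched as a function that returns one parameter object per `Task()` call. The helper name and the `features` array are hypothetical, and `parallelization` is assumed to be a boolean; the source only describes it as a config value.

```javascript
// Hypothetical helper: returns one executor parameter set per Task() to spawn.
function buildExecutorTasks(features, ctx) {
  const base = {
    workflow_option: ctx.option,
    dev_repo_path: ctx.dev_repo_path,
    qa_repo_path: ctx.qa_repo_path ?? null,
    output_path: `${ctx.output_dir}/`,
    codebase_map_dir: ctx.codebase_map_dir,
  };
  // Option 3 only: skip test cases whose test ID already exists in the QA repo.
  if (ctx.option === 3) base.skip_existing_test_ids = true;

  // ASSUMPTION: parallelization is a boolean flag.
  const parallel = features.length > 1 && ctx.parallelization === true;
  if (!parallel) return [base]; // single executor covering all tasks
  return features.map((feature) => ({ ...base, feature_group: feature }));
}

const tasks = buildExecutorTasks(["auth", "checkout"], {
  option: 3,
  dev_repo_path: "./app",
  qa_repo_path: "./qa",
  output_dir: ".qa-output",
  codebase_map_dir: ".qa-output/codebase",
  parallelization: true,
});
console.log(tasks);
```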
623
-
624
- <step name="execute_validate">
625
- ## Step 8: Execute Validate Stage
626
-
627
- **State update -- mark validate as running:**
628
- ```bash
629
- node bin/qaa-tools.cjs state patch --"Validate Status" running --"Status" "Validating generated tests"
630
- ```
631
-
632
- **Print stage banner:**
633
- ```
634
- +------------------------------------------+
635
- | STAGE 6: Validator |
636
- | Validating {total_files} test files |
637
- | Status: Running... |
638
- +------------------------------------------+
639
- ```
640
-
641
- **Spawn validator agent via Task():**
642
- ```
643
- Task(
644
- prompt="
645
- <objective>Run 4-layer validation on all generated test files</objective>
646
- <execution_context>@agents/qaa-validator.md</execution_context>
647
- <files_to_read>
648
- - {list all generated test files from executor return -- files_created paths}
649
- - {output_dir}/GENERATION_PLAN.md
650
- - CLAUDE.md
651
- </files_to_read>
652
- <parameters>
653
- mode: validation
654
- output_path: {output_dir}/VALIDATION_REPORT.md
655
- </parameters>
656
- "
657
- )
658
- ```
659
-
660
- **Parse validator return:**
661
-
662
- Expected return structure:
663
- ```
664
- VALIDATOR_COMPLETE:
665
- report_path: "..."
666
- overall_status: PASS | PASS_WITH_WARNINGS | FAIL
667
- confidence: HIGH | MEDIUM | LOW
668
- layers_summary: {syntax, structure, dependencies, logic}
669
- fix_loops_used: N
670
- issues_found: N
671
- issues_fixed: N
672
- unresolved_count: N
673
- ```
674
-
675
- **RISKY CHECKPOINT -- Validator escalation (FLOW-07):**
676
-
677
- If `unresolved_count > 0` after max fix loops (3):
678
- - **ALWAYS pause, even in auto mode** (this is a locked decision from CONTEXT.md)
679
- - Present unresolved issues to user with full details from VALIDATION_REPORT.md
680
- - Wait for user decision:
681
- - `"approve-with-warnings"`: Accept the validation with warnings. Set Validate Status to complete. Continue to deliver.
682
- - `"abort"`: Set Validate Status to failed. STOP PIPELINE ENTIRELY.
683
- - Manual guidance: User provides specific fix instructions. Spawn fresh continuation agent to apply fixes and re-validate.
684
-
685
- If `overall_status` is `PASS` or `PASS_WITH_WARNINGS` (and unresolved_count is 0):
686
- ```bash
687
- node bin/qaa-tools.cjs state patch --"Validate Status" complete --"Status" "Validation passed"
688
- ```
689
-
690
- Print: "Validation complete. Status: {overall_status}. Confidence: {confidence}. {issues_found} issues found, {issues_fixed} fixed, {unresolved_count} unresolved."
691
- </step>
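The validator escalation rule is the one checkpoint that ignores auto mode, which a sketch makes explicit: the function below deliberately takes no `isAuto` argument. Names and return values are hypothetical. The source does not say what state to write when `overall_status` is `FAIL` with zero unresolved issues, so that branch is an assumption based on the bug-detective condition in Step 10.

```javascript
// Hypothetical helper: decides what happens after the validator returns.
// Note there is no isAuto parameter: unresolved issues always pause the pipeline.
function nextAfterValidation(ret) {
  if (ret.unresolved_count > 0) {
    return { action: "pause-for-user", validateStatus: null };
  }
  if (ret.overall_status === "PASS" || ret.overall_status === "PASS_WITH_WARNINGS") {
    return { action: "continue", validateStatus: "complete" };
  }
  // ASSUMPTION: FAIL without unresolved issues routes to failure classification.
  return { action: "classify-failures", validateStatus: null };
}

// Maps the user's answer at the risky checkpoint to a state change.
function applyUserDecision(decision) {
  if (decision === "approve-with-warnings") return { validateStatus: "complete", next: "continue" };
  if (decision === "abort") return { validateStatus: "failed", next: "stop" };
  // Anything else is treated as manual fix guidance.
  return { validateStatus: null, next: "apply-fixes-and-revalidate" };
}

console.log(nextAfterValidation({ overall_status: "PASS_WITH_WARNINGS", unresolved_count: 2 }));
console.log(applyUserDecision("abort"));
```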
692
-
693
- <step name="execute_e2e_runner">
694
- ## Step 9: Execute E2E Runner Stage (Conditional)
695
-
696
- **Condition:** Only execute if E2E test files were generated. Check if `files_created` from executor return contains any `*.e2e.spec.*` or `*.e2e.cy.*` files. Also requires a live application URL.
697
-
698
- If no E2E files were generated:
699
- Print: "Skipping E2E Runner (no E2E test files generated)." and proceed to Step 10.
700
-
701
- **State update:**
702
- ```bash
703
- node bin/qaa-tools.cjs state patch --"Status" "Running E2E tests against live app"
704
- ```
705
-
706
- **Print stage banner:**
707
- ```
708
- +------------------------------------------+
709
- | STAGE 7: E2E Runner |
710
- | Running tests against live application |
711
- | Status: Running... |
712
- +------------------------------------------+
713
- ```
714
-
715
- **Spawn e2e-runner agent via Task():**
716
- ```
717
- Task(
718
- prompt="
719
- <objective>Run E2E tests against live application, capture real locators, fix mismatches, loop until pass</objective>
720
- <execution_context>@agents/qaa-e2e-runner.md</execution_context>
721
- <files_to_read>
722
- - CLAUDE.md
723
- - {e2e_test_files from executor return}
724
- - {pom_files from executor return}
725
- </files_to_read>
726
- <parameters>
727
- app_url: {app_url if provided, otherwise auto-detect}
728
- output_dir: {output_dir}
729
- dev_repo_path: {dev_repo_path}
730
- </parameters>
731
- "
732
- )
733
- ```
734
-
735
- **Parse e2e-runner return:**
736
-
737
- Expected return structure:
738
- ```
739
- E2E_RUNNER_COMPLETE:
740
- app_url: "..."
741
- total_tests: N
742
- passed: N
743
- failed: N
744
- locator_fixes: N
745
- app_bugs_found: N
746
- fix_loops_used: N
747
- report_path: "..."
748
- screenshots: [...]
749
- ```
750
-
751
- Capture `passed`, `failed`, `app_bugs_found`, `locator_fixes` for pipeline summary.
752
-
753
- If `app_bugs_found > 0`:
754
- Print: "E2E Runner found {app_bugs_found} application bug(s). See E2E_RUN_REPORT.md for details with screenshots."
755
-
756
- If `failed > 0` and `app_bugs_found == 0`:
757
- Print: "E2E Runner: {failed} test(s) still failing after {fix_loops_used} fix loops. Proceeding to Bug Detective for classification."
758
-
759
- Print: "E2E run complete. {passed}/{total_tests} passed. {locator_fixes} locators fixed. {app_bugs_found} app bugs found."
760
- </step>
761
-
762
- <step name="execute_bug_detective">
763
- ## Step 10: Execute Bug Detective Stage (Conditional)
764
-
765
- **Condition:** Only execute if the validator reports test failures. Check:
766
- - `overall_status === 'FAIL'` in validator return, OR
767
- - Generated tests have runtime failures that need classification
768
-
769
- If the validator reports `PASS` or `PASS_WITH_WARNINGS` and there are no test execution failures, skip this stage entirely.
770
-
771
- **If no failures to classify:**
772
- Print: "Skipping Bug Detective (no test failures detected)." and proceed directly to the Deliver stage.
773
-
774
- **If failures need classification:**
775
-
776
- **State update:**
777
- ```bash
778
- node bin/qaa-tools.cjs state patch --"Status" "Classifying test failures"
779
- ```
780
-
781
- **Print stage banner:**
782
- ```
783
- +------------------------------------------+
784
- | STAGE 7: Bug Detective |
785
- | Status: Running... |
786
- +------------------------------------------+
787
- ```
788
-
789
- **Spawn bug-detective agent via Task():**
790
- ```
791
- Task(
792
- prompt="
793
- <objective>Classify test failures and attempt auto-fixes for test errors</objective>
794
- <execution_context>@agents/qaa-bug-detective.md</execution_context>
795
- <files_to_read>
796
- - {test execution results -- from validator or direct test run}
797
- - {failing test source files -- paths from executor return}
798
- - CLAUDE.md
799
- </files_to_read>
800
- <parameters>
801
- output_path: {output_dir}/FAILURE_CLASSIFICATION_REPORT.md
802
- </parameters>
803
- "
804
- )
805
- ```
806
-
807
- **Parse bug-detective return:**
808
-
809
- Expected return structure:
810
- ```
811
- DETECTIVE_COMPLETE:
812
- report_path: "..."
813
- total_failures: N
814
- classification_breakdown: {app_bug: N, test_error: N, env_issue: N, inconclusive: N}
815
- auto_fixes_applied: N
816
- auto_fixes_verified: N
817
- commit_hash: "..."
818
- ```
819
-
820
- **RISKY CHECKPOINT -- Application bugs detected:**
821
-
822
- If `classification_breakdown.app_bug > 0`:
823
- - **ALWAYS pause, even in auto mode** (locked decision -- application bugs require developer action)
824
- - Present APPLICATION BUG classifications to user with full evidence from FAILURE_CLASSIFICATION_REPORT.md
825
- - These are genuine bugs in the application code discovered during test execution
826
- - The bug detective never touches application code -- it only reports
827
- - User must review and decide how to proceed:
828
- - Acknowledge bugs and continue pipeline (bugs will be in the PR description for developer attention)
829
- - Abort pipeline to fix bugs first
830
-
831
- Print: "Bug Detective complete. {total_failures} failures classified: {app_bug} APP BUG, {test_error} TEST ERROR, {env_issue} ENV ISSUE, {inconclusive} INCONCLUSIVE. {auto_fixes_applied} auto-fixes applied."
832
- </step>
833
-
834
- <step name="execute_deliver">
835
- ## Step 11: Execute Deliver Stage
836
-
837
- **State update -- mark deliver as running:**
838
- ```bash
839
- node bin/qaa-tools.cjs state patch --"Deliver Status" running --"Status" "Preparing delivery"
840
- ```
841
-
842
- **Print stage banner:**
843
- ```
844
- +------------------------------------------+
845
- | STAGE 8: Deliver |
846
- | Status: Running... |
847
- +------------------------------------------+
848
- ```
849
-
850
- ### Sub-step 1: Pre-flight checks
851
-
852
- Before attempting branch creation or PR, verify that the required tools are available and authenticated.
853
-
854
- **Check for git remote:**
855
- ```bash
856
- REMOTE_URL=$(git remote get-url origin 2>/dev/null || echo "")
857
- ```
858
-
859
- If `REMOTE_URL` is empty:
860
- - Print error: `"No git remote found. Artifacts committed locally but PR creation skipped."`
861
- - Set `LOCAL_ONLY=true`
862
- - Skip Sub-steps 6, 7, 8, 9 (push and PR creation). Still execute Sub-steps 2-5 (branch creation and commits) so artifacts are organized on a local branch.
863
-
864
- **Check for gh CLI authentication:**
865
- ```bash
866
- gh auth status 2>/dev/null
867
- ```
868
-
869
- If `gh auth status` fails (non-zero exit code):
870
- - Print error: `"gh CLI not authenticated. Run 'gh auth login' first. Artifacts committed locally."`
871
- - Set `LOCAL_ONLY=true`
872
- - Same skip behavior as above.
873
-
874
- If both checks pass, set `LOCAL_ONLY=false`.
875
-
876
- ### Sub-step 2: Derive project name
877
-
878
- Read `dev_repo_path` from init context (available from Step 1).
879
-
880
- **Attempt to read from package.json:**
881
- ```bash
882
- PROJECT_NAME=$(node -e "try { const p = require('${dev_repo_path}/package.json'); console.log(p.name || ''); } catch { console.log(''); }")
883
- ```
884
-
885
- **Fallback to directory basename:**
886
- ```bash
887
- if [ -z "$PROJECT_NAME" ]; then
888
- PROJECT_NAME=$(basename "${dev_repo_path}")
889
- fi
890
- ```
891
-
892
- **Sanitize for branch naming (lowercase, alphanumeric and hyphens only):**
893
- ```bash
894
- PROJECT_NAME=$(echo "$PROJECT_NAME" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]/-/g' | sed 's/--*/-/g' | sed 's/^-//' | sed 's/-$//')
895
- ```
896
-
897
- ### Sub-step 3: Detect default branch
898
-
899
- ```bash
900
- DEFAULT_BRANCH=$(gh repo view --json defaultBranchRef --jq '.defaultBranchRef.name' 2>/dev/null || echo "main")
901
- ```
902
-
903
- If `gh repo view` fails (e.g., no remote or gh not authenticated), fall back to `"main"`.
904
-
905
- ### Sub-step 4: Create feature branch
906
-
907
- **Branch name:** `qa/auto-{PROJECT_NAME}-{date}` where `date` comes from init context.
908
-
909
- **Handle collision:** If branch already exists locally or remotely, append a numeric suffix:
910
- ```bash
911
- BRANCH="qa/auto-${PROJECT_NAME}-${date}"
912
-
913
- # Check if branch exists locally or remotely
914
- if git rev-parse --verify "$BRANCH" 2>/dev/null || git rev-parse --verify "origin/$BRANCH" 2>/dev/null; then
915
- SUFFIX=2
916
- while git rev-parse --verify "${BRANCH}-${SUFFIX}" 2>/dev/null || git rev-parse --verify "origin/${BRANCH}-${SUFFIX}" 2>/dev/null; do
917
- SUFFIX=$((SUFFIX + 1))
918
- done
919
- BRANCH="${BRANCH}-${SUFFIX}"
920
- fi
921
- ```
922
-
923
- **Create from default branch:**
924
- ```bash
925
- git checkout -b "$BRANCH" "$DEFAULT_BRANCH"
926
- ```
927
-
928
- If branch creation fails, print an error and fall back to committing locally on the current branch.
929
-
930
- ### Sub-step 5: Per-stage atomic commits
931
-
932
- For each pipeline stage that produced artifacts, commit using `qaa-tools.cjs commit`. Check file existence before each commit -- skip stages that did not produce artifacts.
933
-
934
- **Scanner:**
935
- ```bash
936
- if [ -f "${output_dir}/SCAN_MANIFEST.md" ]; then
937
- node bin/qaa-tools.cjs commit "qa(scanner): produce SCAN_MANIFEST.md for ${PROJECT_NAME}" --files ${output_dir}/SCAN_MANIFEST.md
938
- fi
939
- ```
940
-
941
- **Analyzer (Option 1):**
942
- ```bash
943
- if [ -f "${output_dir}/QA_ANALYSIS.md" ]; then
944
- ANALYZER_FILES="${output_dir}/QA_ANALYSIS.md ${output_dir}/TEST_INVENTORY.md"
945
- if [ -f "${output_dir}/QA_REPO_BLUEPRINT.md" ]; then
946
- ANALYZER_FILES="${ANALYZER_FILES} ${output_dir}/QA_REPO_BLUEPRINT.md"
947
- fi
948
- node bin/qaa-tools.cjs commit "qa(analyzer): produce QA_ANALYSIS.md and TEST_INVENTORY.md" --files ${ANALYZER_FILES}
949
- fi
950
- ```
951
-
952
- **Analyzer (Option 2/3):**
953
- ```bash
954
- if [ -f "${output_dir}/GAP_ANALYSIS.md" ]; then
955
- node bin/qaa-tools.cjs commit "qa(analyzer): produce GAP_ANALYSIS.md" --files ${output_dir}/GAP_ANALYSIS.md
956
- fi
957
- ```
958
-
959
- **TestID Injector (if ran):**
960
- ```bash
961
- if [ -f "${output_dir}/TESTID_AUDIT_REPORT.md" ]; then
962
- node bin/qaa-tools.cjs commit "qa(testid-injector): inject ${injected_count} data-testid attributes across ${component_count} components" --files ${output_dir}/TESTID_AUDIT_REPORT.md ${modified_source_files}
963
- fi
964
- ```
965
-
966
- Where `injected_count` and `component_count` are captured from the testid-injector return in Step 5, and `modified_source_files` are the frontend source files that were modified.
967
-
968
- **Executor:**
969
- ```bash
970
- if [ -n "${generated_file_paths}" ]; then
971
- node bin/qaa-tools.cjs commit "qa(executor): generate ${total_files} test files with POMs and fixtures" --files ${generated_file_paths}
972
- fi
973
- ```
974
-
975
- Where `total_files` and `generated_file_paths` are captured from the executor return in Step 7.
976
-
977
- **Validator:**
978
- ```bash
979
- if [ -f "${output_dir}/VALIDATION_REPORT.md" ]; then
980
- node bin/qaa-tools.cjs commit "qa(validator): validate generated tests - ${overall_status} with ${confidence} confidence" --files ${output_dir}/VALIDATION_REPORT.md
981
- fi
982
- ```
983
-
984
- Where `overall_status` and `confidence` are captured from the validator return in Step 8.
985
-
986
- **Bug Detective (if ran):**
987
- ```bash
988
- if [ -f "${output_dir}/FAILURE_CLASSIFICATION_REPORT.md" ]; then
989
- node bin/qaa-tools.cjs commit "qa(bug-detective): classify ${total_failures} failures - ${classification_summary}" --files ${output_dir}/FAILURE_CLASSIFICATION_REPORT.md
990
- fi
991
- ```
992
-
993
- Where `total_failures` and `classification_summary` (e.g., "2 APP BUG, 2 TEST ERROR, 1 ENV ISSUE") are captured from the bug-detective return in Step 10.
994
-
995
- ### Sub-step 6: Push branch
996
-
997
- If `LOCAL_ONLY` is true, skip this sub-step.
998
-
999
- ```bash
1000
- git push -u origin "$BRANCH"
1001
- ```
1002
-
1003
- If push fails:
1004
- - Print error: `"Push failed: {error_message}. Artifacts committed locally on branch ${BRANCH}."`
1005
- - Set `LOCAL_ONLY=true` (skip PR creation but keep local commits).
1006
- - Do NOT set deliver status to failed -- artifacts are committed locally.
1007
-
1008
- ### Sub-step 7: Build PR body
1009
-
1010
- If `LOCAL_ONLY` is true, skip this sub-step.
1011
-
1012
- **Read the PR template:**
1013
- ```bash
1014
- PR_BODY=$(cat templates/pr-template.md)
1015
- ```
1016
-
1017
- **Replace all `{placeholder}` tokens with actual values collected during pipeline execution:**
1018
-
1019
- - `{architecture_type}` -- from QA_ANALYSIS.md architecture overview section (or init context if available). Example: "Next.js full-stack application"
1020
- - `{framework}` -- from QA_ANALYSIS.md or SCAN_MANIFEST.md framework detection. Example: "Playwright"
1021
- - `{risk_summary}` -- from QA_ANALYSIS.md risk assessment counts. Example: "3 HIGH, 5 MEDIUM, 2 LOW"
1022
- - `{unit_count}` -- from `pyramid_breakdown.unit` captured in Step 4 (analyzer return)
1023
- - `{integration_count}` -- from `pyramid_breakdown.integration`
1024
- - `{api_count}` -- from `pyramid_breakdown.api`
1025
- - `{e2e_count}` -- from `pyramid_breakdown.e2e`
1026
- - `{total_count}` -- from `total_test_count` captured in Step 4
1027
- - `{modules_covered}` -- count of modules/files with at least one test case in TEST_INVENTORY.md
1028
- - `{coverage_estimate}` -- estimated coverage percentage from QA_ANALYSIS.md recommended testing pyramid section
1029
- - `{validation_result}` -- from `overall_status` captured in Step 8 (validator return). PASS, PASS_WITH_WARNINGS, or FAIL
1030
- - `{confidence}` -- from `confidence` captured in Step 8. HIGH, MEDIUM, or LOW
1031
- - `{fix_loops_used}` -- from `fix_loops_used` captured in Step 8. Number 0-3
1032
- - `{issues_found}` -- from `issues_found` captured in Step 8
1033
- - `{issues_fixed}` -- from `issues_fixed` captured in Step 8
1034
- - `{file_list}` -- if total generated files <= 50, list each file as `- {path}`. If > 50, use summary: `{N} files across {M} directories`
1035
-
1036
- All replacements use simple string substitution. The PR body must be in English.
1037
-
1038
- ### Sub-step 8: Create draft PR
1039
-
1040
- If `LOCAL_ONLY` is true, skip this sub-step.
1041
-
1042
- ```bash
1043
- PR_URL=$(gh pr create \
1044
- --draft \
1045
- --title "qa: automated test suite for ${PROJECT_NAME}" \
1046
- --body "${PR_BODY}" \
1047
- --label "qa-automation" \
1048
- --label "auto-generated" \
1049
- --assignee "@me" 2>&1)
1050
- ```
1051
-
1052
- **Do NOT pass `--base` flag.** Let gh auto-detect the default branch to avoid errors when the default branch is not named "main".
1053
-
1054
- **On success:** Capture the PR URL from stdout. The URL is the last line of `gh pr create` output.
1055
-
1056
- **On failure:**
1057
- - Print error: `"PR creation failed: ${PR_URL}. Artifacts remain on branch ${BRANCH}."`
1058
- - Set deliver status to failed:
1059
- ```bash
1060
- node bin/qaa-tools.cjs state patch --"Deliver Status" failed --"Status" "Deliver failed: PR creation error"
1061
- ```
1062
- - Do NOT stop the pipeline -- artifacts are committed and pushed. The QA engineer can create the PR manually.
1063
-
1064
- ### Sub-step 9: Print PR URL
1065
-
1066
- If PR was created successfully:
1067
- ```
1068
- PR created: ${PR_URL}
1069
- ```
1070
-
1071
- If `LOCAL_ONLY` is true:
1072
- ```
1073
- PR: not created (local-only mode). Artifacts committed on branch: ${BRANCH}
1074
- ```
1075
-
1076
- Store `PR_URL` (or the local-only message) for inclusion in the pipeline summary banner.
1077
-
1078
- **State update -- mark deliver as complete (on success):**
1079
- ```bash
1080
- node bin/qaa-tools.cjs state patch --"Deliver Status" complete --"Status" "Pipeline complete"
1081
- ```
1082
-
1083
- **Clear auto-chain flag at pipeline completion:**
1084
- ```bash
1085
- node bin/qaa-tools.cjs config-set workflow._auto_chain_active false
1086
- ```
1087
- </step>
1088
-
1089
- </process>
1090
-
1091
- <auto_advance>
1092
- ## Auto-Advance Mode
1093
-
1094
- Auto-advance is enabled when ANY of these is true:
1095
- - config.json `workflow.auto_advance = true` (persistent user preference)
1096
- - `--auto` flag passed to orchestrator invocation (per-run override)
1097
- - `workflow._auto_chain_active = true` in config (ephemeral chain flag from ongoing auto run)
1098
-
1099
- ### Behavior in Auto Mode
1100
-
1101
- **Safe checkpoints are auto-approved.** The pipeline continues without pausing. A log message records the auto-approval:
1102
- ```
1103
- Auto-approved: {checkpoint_description}
1104
- ```
1105
-
1106
- **Risky checkpoints ALWAYS pause.** Even in auto mode, the pipeline stops and presents the checkpoint to the user. This is a locked decision -- unresolved validation issues and application bugs require human judgment.
1107
-
1108
- **Full progress banners shown** even in auto mode -- user sees pipeline flowing in terminal with stage banners, agent spawning indicators, and completion messages. Auto mode does not suppress output.
1109
-
1110
- **On stage failure in auto mode:** STOP PIPELINE ENTIRELY. Report which stage failed and why. No partial PR. User must intervene.
1111
-
1112
- ### Safe vs Risky Checkpoint Classification
1113
-
1114
- See the `<checkpoint_system>` section below for the complete classification table and handling flow.
1115
-
1116
- ### Stale Chain Flag Protection
1117
-
1118
- At orchestrator init, if `--auto` was NOT passed:
1119
- ```bash
1120
- node bin/qaa-tools.cjs config-set workflow._auto_chain_active false
1121
- ```
1122
- This prevents a previous interrupted `--auto` run from causing unexpected auto-advance in a new manual session.
1123
-
1124
- ### Auto Mode Persistence
1125
-
1126
- When `--auto` is passed:
1127
- ```bash
1128
- node bin/qaa-tools.cjs config-set workflow._auto_chain_active true
1129
- ```
1130
- This flag persists across agent spawns within the same pipeline run. Each spawned agent can check it to maintain auto-advance behavior through the chain.
1131
-
1132
- At pipeline completion (success or failure), clear the chain flag:
1133
- ```bash
1134
- node bin/qaa-tools.cjs config-set workflow._auto_chain_active false
1135
- ```
1136
- </auto_advance>
1137
-
1138
- <checkpoint_system>
1139
- ## Checkpoint System
1140
-
1141
- ### Checkpoint Classification
1142
-
1143
- Every agent may return checkpoint data when it encounters a situation requiring human input. The orchestrator classifies each checkpoint as SAFE or RISKY and handles it accordingly.
1144
-
1145
- **SAFE checkpoints (auto-approve in auto mode):**
1146
-
1147
- | Checkpoint | Agent | Why Safe | Auto-Action |
1148
- |------------|-------|----------|-------------|
1149
- | Framework detection uncertain (LOW confidence) | Scanner | Auto-select most likely framework; analysis can continue with reasonable default | Approve with most likely framework |
1150
- | Analyzer assumptions review | Analyzer | Assumptions are informational; incorrect assumptions produce suboptimal but not broken output | Approve all assumptions |
1151
- | TestID audit review | TestID Injector | P0-only injection is conservative; only forms, buttons, and primary actions receive test IDs | Approve P0-only injection |
1152
-
1153
- **RISKY checkpoints (ALWAYS pause, even in auto mode):**
1154
-
1155
- | Checkpoint | Agent | Why Risky | User Action Required |
1156
- |------------|-------|-----------|---------------------|
1157
- | Validator escalation (unresolved issues after 3 fix loops) | Validator | Unresolved issues mean tests may be broken; delivering broken tests defeats the purpose | User decides: approve-with-warnings, abort, or provide fix guidance |
1158
- | APPLICATION BUG classification | Bug Detective | Genuine bugs in application code require developer action, not auto-fix | User reviews bug evidence and decides whether to continue or fix first |
1159
- | Any checkpoint with `blocking` containing "unresolved" or "failed" | Any agent | Indicates pipeline integrity risk; proceeding could produce incorrect artifacts | User reviews the specific blocking issue |
1160
-
1161
- ### Checkpoint Handling Flow
1162
-
1163
- ```
1164
- On agent return with checkpoint data:
1165
- 1. Extract checkpoint `blocking` field content
1166
- 2. Classify as SAFE or RISKY:
1167
- - Match against safe patterns:
1168
- "framework detection" -> SAFE
1169
- "assumptions" -> SAFE
1170
- "audit" or "data-testid" -> SAFE
1171
- - Match against risky patterns:
1172
- "unresolved" -> RISKY
1173
- "failed" -> RISKY
1174
- "APPLICATION BUG" -> RISKY
1175
- - Default (no pattern match) -> RISKY (conservative)
1176
- 3. If IS_AUTO and SAFE:
1177
- - Auto-approve with default action
1178
- - Log: "Auto-approved: {checkpoint_description}"
1179
- - Continue pipeline to next stage
1180
- 4. If IS_AUTO and RISKY:
1181
- - PAUSE pipeline
1182
- - Print checkpoint details with full context:
1183
- - What stage triggered the checkpoint
1184
- - What was completed so far
1185
- - The specific blocking issue
1186
- - What artifacts have been produced
1187
- - Wait for user input
1188
- - On user response: spawn fresh continuation agent
1189
- 5. If NOT auto (manual mode):
1190
- - PAUSE pipeline
1191
- - Print checkpoint details with full context
1192
- - Wait for user input
1193
- - On user response: spawn fresh continuation agent
1194
- ```
1195
-
1196
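The handling flow above can be sketched as two small shell functions. Two choices here are assumptions rather than anything the flow specifies: matching is case-insensitive, and risky patterns are checked before safe ones so that a `blocking` string matching both is treated as RISKY. The function names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: classify a checkpoint from its `blocking` text.
classify_checkpoint() {
  local blocking
  # Lowercase the text so matching is case-insensitive (an assumption).
  blocking=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$blocking" in
    # Risky patterns are tested first (an assumption, chosen to be conservative).
    *unresolved*|*failed*|*"application bug"*) echo "RISKY" ;;
    *"framework detection"*|*assumptions*|*audit*|*data-testid*) echo "SAFE" ;;
    # No pattern matched: default to RISKY.
    *) echo "RISKY" ;;
  esac
}

# Hypothetical helper: auto-approve only SAFE checkpoints in auto mode.
handle_checkpoint() {
  local is_auto=$1 blocking=$2
  if [ "$is_auto" = "true" ] && [ "$(classify_checkpoint "$blocking")" = "SAFE" ]; then
    echo "Auto-approved: ${blocking}"
  else
    echo "PAUSE: ${blocking}"
  fi
}

handle_checkpoint true "Analyzer assumptions review"
handle_checkpoint true "Unresolved issues after 3 fix loops"
handle_checkpoint false "Analyzer assumptions review"
```

Manual mode and risky checkpoints share the same pause branch, which matches steps 4 and 5 of the flow.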
- ### Resume After Checkpoint
1197
-
1198
- When resuming after a checkpoint, spawn a FRESH agent (not serialized state). This follows the GSD pattern: fresh agent with explicit state is more reliable than serialized continuation.
1199
-
1200
- ```
1201
- Task(
1202
- prompt="
1203
- <objective>Continue QA pipeline from {stage} stage</objective>
1204
- <execution_context>@agents/qa-pipeline-orchestrator.md</execution_context>
1205
- <resume_context>
1206
- Pipeline state:
1207
- - Completed stages: {list of completed stages with their results}
1208
- - Current stage: {stage that triggered checkpoint}
1209
- - Checkpoint response: {user's response or decision}
1210
- - Artifacts produced so far: {list of files with paths}
1211
-
1212
- Resume from: {exact step in pipeline to resume from}
1213
- User decision: {what user chose at checkpoint}
1214
- </resume_context>
1215
- "
1216
- )
1217
- ```
1218
-
1219
- The continuation agent reads this resume_context, verifies the completed stages by checking artifact existence on disk, and continues from the specified point. It does NOT re-execute completed stages.
1220
-
1221
- ### Checkpoint Return Structure
1222
-
1223
- Agents return checkpoints in this structure:
1224
- ```
1225
- CHECKPOINT_RETURN:
1226
- completed: "What has been done so far"
1227
- blocking: "What is blocking progress"
1228
- details: "Detailed context about the blocking issue"
1229
- awaiting: "What the user needs to do or provide"
1230
- ```
1231
-
1232
- The orchestrator parses the `blocking` field to classify the checkpoint.
1233
- </checkpoint_system>
1234
-
1235
- <error_handling>
1236
- ## Error Handling
1237
-
1238
- ### Stage Failure Protocol
1239
-
1240
- When any agent returns a failure or error:
1241
-
1242
- 1. **Set stage status to `failed`:**
1243
- ```bash
1244
- node bin/qaa-tools.cjs state patch --"{Stage} Status" failed --"Status" "Pipeline stopped: {Stage} failed - {reason}"
1245
- ```
1246
-
1247
- 2. **Print failure banner:**
1248
- ```
1249
- !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
1250
- ! PIPELINE STOPPED !
1251
- ! Stage: {stage_name} !
1252
- ! Reason: {failure_reason} !
1253
- ! !
1254
- ! Completed: {completed_stages} !
1255
- ! Artifacts: {artifacts_so_far} !
1256
- ! !
1257
- ! Action required: Review and re-run !
1258
- !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
1259
- ```
1260
-
1261
- 3. **DO NOT continue to next stage.** The pipeline stops entirely at the failed stage.
1262
-
1263
- 4. **DO NOT create partial PR.** No branch, no commit, no PR with incomplete results.
1264
-
1265
- 5. **Preserve all artifacts produced so far.** They may be useful for debugging the failure. Artifacts from completed stages remain on disk in `{output_dir}/`.
1266
-
1267
- ### Agent Return Validation
1268
-
1269
- After EVERY agent spawn, before advancing to next stage:
1270
-
1271
- 1. **Check the return for error/stop conditions:**
1272
- - Scanner: Check `decision` field -- if `STOP`, pipeline stops
1273
- - Validator: Check `overall_status` -- if `FAIL` with unresolved issues, checkpoint triggers
1274
- - Bug Detective: Check `classification_breakdown.app_bug` -- if > 0, checkpoint triggers
1275
- - Any agent: Check for error messages, empty returns, or missing expected fields
1276
-
1277
- 2. **Verify expected output artifacts exist on disk:**
1278
- ```bash
1279
- [ -f "{expected_artifact_path}" ] && echo "OK" || echo "MISSING"
1280
- ```
1281
- - Scanner: `{output_dir}/SCAN_MANIFEST.md` must exist
1282
- - Analyzer: `{output_dir}/QA_ANALYSIS.md` (Option 1) or `{output_dir}/GAP_ANALYSIS.md` (Options 2/3) must exist
1283
- - Planner: `{output_dir}/GENERATION_PLAN.md` must exist
1284
- - Executor: All planned test files must exist
1285
- - Validator: `{output_dir}/VALIDATION_REPORT.md` must exist
1286
-
1287
- 3. **If artifacts missing:** Treat as stage failure. Set status to failed and stop pipeline.
1288
-
1289
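A minimal sketch of the artifact check, assuming each stage's expected files are passed as arguments. `verify_artifacts` is a hypothetical helper, and the temporary directory only stands in for `{output_dir}`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if every listed artifact exists on disk.
verify_artifacts() {
  local missing=0 path
  for path in "$@"; do
    if [ -f "$path" ]; then
      echo "OK ${path}"
    else
      echo "MISSING ${path}"
      missing=1
    fi
  done
  return "$missing"
}

# Demo against a throwaway directory standing in for {output_dir}.
output_dir=$(mktemp -d)
touch "${output_dir}/SCAN_MANIFEST.md"

if verify_artifacts "${output_dir}/SCAN_MANIFEST.md"; then
  echo "Scanner artifacts present"
fi

# A missing artifact is treated as a stage failure.
if ! verify_artifacts "${output_dir}/VALIDATION_REPORT.md"; then
  echo "Validator stage failed: artifact missing"
fi
```

Returning a non-zero status lets the caller apply the stage failure protocol with an ordinary `if` test.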
- ### Retry Policy
1290
-
1291
- The orchestrator does NOT retry failed agents automatically. If a stage fails:
1292
-
1293
- - **In auto mode:** Stop pipeline entirely and report the failure. Print which stage failed, what error occurred, and what artifacts were produced before failure.
1294
- - **In manual mode:** Stop and present the failure to user. User can choose to:
1295
- - Retry the failed stage (orchestrator spawns the same agent again)
1296
- - Abort the pipeline
1297
- - Provide guidance and retry with modifications
1298
- </error_handling>
1299
-
1300
- <pipeline_summary>
1301
- ## Pipeline Summary
1302
-
1303
- After all stages complete (or on pipeline stop), print a summary banner:
1304
-
1305
- ```
1306
- ======================================================
1307
- QA PIPELINE COMPLETE
1308
- ======================================================
1309
-
1310
- Option: {option} ({option_description})
1311
- Repository: {dev_repo_path}
1312
- QA Repo: {qa_repo_path or 'N/A'}
1313
- Maturity Score: {maturity_score or 'N/A'}
1314
-
1315
- Stages Completed:
1316
- [{check}] Scan -- {scan_duration} {scan_extra}
1317
- [{check}] Analyze -- {analyze_duration} ({test_count} test cases)
1318
- [{check}] TestID Inject-- {inject_duration or 'skipped'}
1319
- [{check}] Plan -- {plan_duration} ({file_count} files planned)
1320
- [{check}] Generate -- {generate_duration} ({files_created} files created)
1321
- [{check}] Validate -- {validate_duration} ({confidence} confidence)
1322
- [{check}] Bug Detective-- {detective_duration or 'skipped'}
1323
- [{check}] Deliver -- {deliver_duration}
1324
-
1325
- PR: {pr_url or 'not created (local-only)'}
1326
-
1327
- Artifacts:
1328
- {list all produced .md files in output_dir}
1329
-
1330
- Total Time: {total_duration}
1331
- ======================================================
1332
- ```
1333
-
1334
- Where:
1335
- - `[x]` = stage completed successfully
1336
- - `[ ]` = stage skipped (testid-inject when no frontend, bug-detective when no failures)
1337
- - `[!]` = stage failed
1338
-
1339
- **On pipeline failure:** The summary still prints, but shows which stages completed and which failed, along with the failure reason.
1340
-
1341
- **Artifact list includes:**
1342
- - SCAN_MANIFEST.md (always)
1343
- - QA_ANALYSIS.md (Option 1) or GAP_ANALYSIS.md (Options 2/3)
1344
- - TEST_INVENTORY.md (Option 1)
1345
- - QA_REPO_BLUEPRINT.md (Option 1, if produced)
1346
- - TESTID_AUDIT_REPORT.md (if frontend detected)
1347
- - GENERATION_PLAN.md (if plan stage completed)
1348
- - Generated test files (if generate stage completed)
1349
- - VALIDATION_REPORT.md (if validate stage completed)
1350
- - FAILURE_CLASSIFICATION_REPORT.md (if bug detective ran)
1351
- </pipeline_summary>
1352
-
1353
- <quality_gate>
1354
- ## Quality Gate
1355
-
1356
- Before this orchestrator is considered complete, verify:
1357
-
1358
- - [ ] All 3 workflow options route to correct stage sequences:
1359
- - Option 1: scan(dev) -> analyze(full) -> [testid-inject] -> plan -> generate -> validate -> [bug-detective] -> deliver
1360
- - Option 2: scan(both) -> analyze(gap) -> [testid-inject] -> plan(gap) -> generate(gap) -> validate -> [bug-detective] -> deliver
1361
- - Option 3: scan(both) -> analyze(gap) -> [testid-inject] -> plan(gap) -> generate(skip-existing) -> validate -> [bug-detective] -> deliver
1362
- - [ ] Every agent spawn is bracketed by state updates (running before, complete/failed after)
1363
- - [ ] Auto-advance correctly classifies safe vs risky checkpoints
1364
- - [ ] Pipeline stops entirely on any stage failure (no partial PR)
1365
- - [ ] Progress banners print for every stage even in auto mode
1366
- - [ ] Deliver stage creates branch, commits per-stage, pushes, and creates draft PR via gh CLI
1367
- - [ ] Resume spawns fresh agent with explicit state (no serialization)
1368
- </quality_gate>
1369
-
1370
- <success_criteria>
1371
- ## Success Criteria
1372
-
1373
- 1. QA engineer can invoke orchestrator and pipeline runs through all stages for their repo type
1374
- 2. Option detection is automatic based on repo count and maturity scoring
1375
- 3. Pipeline state in STATE.md accurately reflects progress at every point
1376
- 4. Checkpoints pause when appropriate and auto-approve when safe
1377
- 5. Failure in any stage stops the pipeline cleanly with actionable error message
1378
- </success_criteria>
1
+ <purpose>
2
+ Single orchestrator for the QA automation pipeline. Coordinates all 7 agent types (scanner, analyzer, planner, executor, validator, bug-detective, testid-injector) across 3 workflow options. Owns all pipeline state transitions -- agents never update state directly. The orchestrator sets stage status to 'running' before spawning an agent and 'complete' or 'failed' after the agent returns.
3
+
4
+ Invoked by the `/qa-start` slash command (Phase 6) or directly via Task() with this file as execution_context. Accepts 0-2 repo paths: with 0 paths, the cwd is used as the dev repo (Option 1); 1 path is treated as dev-only (Option 1); 2 paths trigger maturity scoring to determine Option 2 or 3.
5
+
6
+ **Pipeline stages in order:**
7
+ ```
8
+ scan -> codebase-map -> analyze -> [testid-inject if frontend] -> plan -> generate -> validate -> [e2e-runner if E2E tests] -> [bug-detective if failures] -> deliver
9
+ ```
10
+
11
+ **Workflow options:**
12
+ - Option 1: Dev-only repo -- full pipeline from scratch
13
+ - Option 2: Dev + immature QA repo -- gap-fill and standardize
14
+ - Option 3: Dev + mature QA repo -- surgical additions only
15
+ </purpose>
16
+
17
+ <required_reading>
18
+ Read these files BEFORE executing any pipeline stage. Do NOT skip.
19
+
20
+ - **CLAUDE.md** -- Agent pipeline stages, module boundaries, quality gates, stage transitions, auto-advance rules, agent coordination, data-testid convention. Read the full file.
21
+ - **.planning/STATE.md** -- Current pipeline state. Check scan_status, analyze_status, generate_status, validate_status, deliver_status fields.
22
+ - **.planning/config.json** -- Workflow configuration: auto_advance flag, parallelization flag, mode, commit_docs.
23
+ </required_reading>
24
+
25
+ <process>
26
+
27
+ <step name="initialize">
28
+ ## Step 1: Initialize Pipeline
29
+
30
+ Call `qaa-tools.cjs init qa-start` to bootstrap the full workflow context.
31
+
32
+ ```bash
33
+ INIT_JSON=$(node bin/qaa-tools.cjs init qa-start)
34
+ ```
35
+
36
+ Parse the JSON to extract all required fields:
37
+
38
+ ```
39
+ option -- 1, 2, or 3 (workflow routing)
40
+ dev_repo_path -- path to the developer repository
41
+ qa_repo_path -- path to existing QA repository (null for Option 1)
42
+ maturity_score -- 0-100 QA repo quality score (null for Option 1)
43
+ maturity_note -- descriptive note about maturity assessment (null for Option 1)
44
+ output_dir -- ".qa-output" (where agents write artifacts)
45
+ date -- "YYYY-MM-DD" for branch naming and timestamps
46
+
47
+ scanner_model -- model for scanner agent
48
+ analyzer_model -- model for analyzer agent
49
+ planner_model -- model for planner agent
50
+ executor_model -- model for executor agent
51
+ validator_model -- model for validator agent
52
+ detective_model -- model for bug-detective agent
53
+ injector_model -- model for testid-injector agent
54
+
55
+ auto_advance -- persistent config flag (boolean)
56
+ auto_chain_active -- ephemeral chain flag (boolean)
57
+ parallelization -- parallelization config value
58
+ commit_docs -- whether to commit documentation artifacts
59
+ ```
60
+
61
+ **Determine auto-advance mode:**
62
+
63
+ ```bash
64
+ IS_AUTO=false
65
+
66
+ # Check persistent config flag
67
+ if auto_advance is true OR auto_chain_active is true; then
68
+ IS_AUTO=true
69
+ node bin/qaa-tools.cjs config-set workflow._auto_chain_active true
70
+ fi
71
+
72
+ # Check if --auto was passed as argument to orchestrator invocation
73
+ if --auto flag was passed; then
74
+ IS_AUTO=true
75
+ node bin/qaa-tools.cjs config-set workflow._auto_chain_active true
76
+ fi
77
+ ```
78
+
79
+ **Safety: Clear stale auto-chain flag** -- if NOT in auto mode, clear the ephemeral flag to prevent a previous interrupted `--auto` run from causing unexpected auto-advance:
80
+
81
+ ```bash
82
+ if IS_AUTO is false:
83
+ node bin/qaa-tools.cjs config-set workflow._auto_chain_active false
84
+ ```
85
+
86
+ **Print initialization banner:**
87
+
88
+ ```
89
+ === QA Pipeline Orchestrator ===
90
+ Option: {option} ({description})
91
+ Dev Repo: {dev_repo_path}
92
+ QA Repo: {qa_repo_path or 'N/A'}
93
+ Maturity Score: {maturity_score or 'N/A'}
94
+ Auto-Advance: {IS_AUTO}
95
+ Date: {date}
96
+ ================================
97
+ ```
98
+
99
+ Where `{description}` is:
100
+ - Option 1: "Dev-Only -- Full Pipeline"
101
+ - Option 2: "Dev + Immature QA -- Gap-Fill"
102
+ - Option 3: "Dev + Mature QA -- Surgical"
103
+ </step>
104
+
105
+ <step name="route_by_option">
106
+ ## Step 2: Route by Option
107
+
108
+ Based on the `option` value from init, select the stage sequence. All options share the same core pipeline but differ in how agents are parameterized and which artifacts they produce.
109
+
110
+ **Option 1 stages:**
111
+ ```
112
+ scan(dev) -> codebase-map -> analyze(full) -> [testid-inject if frontend] -> plan -> generate -> validate -> [bug-detective if failures] -> deliver
113
+ ```
114
+ - Scanner: scan DEV repo only
115
+ - Codebase Map: deep-scan codebase for testability, risk, patterns, existing tests
116
+ - Analyzer: mode='full' (produces QA_ANALYSIS.md + TEST_INVENTORY.md + QA_REPO_BLUEPRINT.md)
117
+ - Planner: reads TEST_INVENTORY.md + QA_ANALYSIS.md + codebase map documents
118
+ - Executor: generates all planned test files using codebase map for context
119
+ - All stages run against DEV repo artifacts
120
+
121
+ **Option 2 stages:**
122
+ ```
123
+ scan(both) -> codebase-map -> analyze(gap) -> [testid-inject if frontend] -> plan(gap) -> generate(gap) -> validate -> [bug-detective if failures] -> deliver
124
+ ```
125
+ - Scanner: scan BOTH dev_repo_path and qa_repo_path
126
+ - Codebase Map: deep-scan dev codebase for testability, risk, patterns, existing tests
127
+ - Analyzer: mode='gap' (produces GAP_ANALYSIS.md)
128
+ - Planner: reads GAP_ANALYSIS.md + codebase map documents (fix broken first, then add missing, then standardize)
129
+ - Executor: generates fixed test files + new test files + standardized files
130
+ - All stages aware of existing QA repo structure
131
+
132
+ **Option 3 stages:**
133
+ ```
134
+ scan(both) -> codebase-map -> analyze(gap) -> [testid-inject if frontend] -> plan(gap) -> generate(skip-existing) -> validate -> [bug-detective if failures] -> deliver
135
+ ```
136
+ - Scanner: scan BOTH dev_repo_path and qa_repo_path
137
+ - Codebase Map: deep-scan dev codebase for testability, risk, patterns, existing tests
138
+ - Analyzer: mode='gap' (produces GAP_ANALYSIS.md with thin areas only)
139
+ - Planner: reads GAP_ANALYSIS.md + codebase map documents (missing tests only)
140
+ - Executor: passes `skip_existing_test_ids: true` so it checks existing test files by test ID before generating -- skips tests that already exist
141
+ - Only new test files are generated; existing working tests are left untouched
142
+
143
+ **Shared stages across all options:**
144
+ - TestID injection: conditional on `has_frontend` from scanner return
145
+ - Validation: always runs on generated files
146
+ - Bug detective: conditional on test failures
147
+ - Deliver: always runs (branch creation, per-stage commits, push, draft PR via gh CLI)
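The routing rules above can be sketched as a small lookup. This is an illustration only: `stage_sequence` is a hypothetical helper, and `has_failures` stands in for whatever signal the orchestrator uses to decide that the bug detective should run.

```bash
# Hypothetical helper: print the stage sequence for a workflow option.
# $1 = option (1, 2, or 3)
# $2 = has_frontend ("true"/"false"), $3 = has_failures ("true"/"false")
stage_sequence() {
  case "$1" in
    1) scan="scan(dev)";  analyze="analyze(full)"; plan="plan";      generate="generate" ;;
    2) scan="scan(both)"; analyze="analyze(gap)";  plan="plan(gap)"; generate="generate(gap)" ;;
    3) scan="scan(both)"; analyze="analyze(gap)";  plan="plan(gap)"; generate="generate(skip-existing)" ;;
    *) echo "unknown option: $1" >&2; return 1 ;;
  esac
  stages="$scan codebase-map $analyze"
  # Conditional stage: only when the scanner reported a frontend
  if [ "$2" = "true" ]; then stages="$stages testid-inject"; fi
  stages="$stages $plan $generate validate"
  # Conditional stage: only when there are failures to classify
  if [ "$3" = "true" ]; then stages="$stages bug-detective"; fi
  echo "$stages deliver"
}

stage_sequence 3 true false
```

Keeping the per-option differences in one `case` makes it easy to see that only the scan scope, analyzer mode, and plan/generate variants change between options.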
148
+ </step>
149
+
150
+ <step name="execute_scan">
151
+ ## Step 3: Execute Scan Stage
152
+
153
+ **State update -- mark scan as running:**
154
+ ```bash
155
+ node bin/qaa-tools.cjs state patch --"Scan Status" running --"Status" "Scanning repository"
156
+ ```
157
+
158
+ **Print stage banner:**
159
+ ```
160
+ +------------------------------------------+
161
+ | STAGE 1: Scanner |
162
+ | Status: Running... |
163
+ +------------------------------------------+
164
+ ```
165
+
166
+ **Spawn scanner agent via Task():**
167
+
168
+ For **Option 1** -- scan DEV repo only:
169
+ ```
170
+ Task(
171
+ prompt="
172
+ <objective>Scan repository and produce SCAN_MANIFEST.md</objective>
173
+ <execution_context>@agents/qaa-scanner.md</execution_context>
174
+ <files_to_read>
175
+ - CLAUDE.md
176
+ </files_to_read>
177
+ <parameters>
178
+ dev_repo_path: {dev_repo_path}
179
+ qa_repo_path: null
180
+ output_path: {output_dir}/SCAN_MANIFEST.md
181
+ </parameters>
182
+ "
183
+ )
184
+ ```
185
+
186
+ For **Options 2 and 3** -- scan BOTH repos:
187
+ ```
188
+ Task(
189
+ prompt="
190
+ <objective>Scan both developer and QA repositories and produce SCAN_MANIFEST.md</objective>
191
+ <execution_context>@agents/qaa-scanner.md</execution_context>
192
+ <files_to_read>
193
+ - CLAUDE.md
194
+ </files_to_read>
195
+ <parameters>
196
+ dev_repo_path: {dev_repo_path}
197
+ qa_repo_path: {qa_repo_path}
198
+ output_path: {output_dir}/SCAN_MANIFEST.md
199
+ </parameters>
200
+ "
201
+ )
202
+ ```
203
+
204
+ **Parse scanner return:**
205
+
206
+ Expected return structure:
207
+ ```
208
+ SCANNER_COMPLETE:
209
+ file_path: ".qa-output/SCAN_MANIFEST.md"
210
+ decision: PROCEED | STOP
211
+ has_frontend: true | false
212
+ detection_confidence: HIGH | MEDIUM | LOW
213
+ ```
214
+
215
+ **Handle decision field:**
216
+
217
+ - If `decision` is `STOP`:
218
+ ```bash
219
+ node bin/qaa-tools.cjs state patch --"Scan Status" failed --"Status" "Pipeline stopped: Scanner returned STOP"
220
+ ```
221
+ Print failure banner and STOP PIPELINE ENTIRELY. Do NOT proceed to any further stage.
222
+
223
+ - If `decision` is `PROCEED`:
224
+ ```bash
225
+ node bin/qaa-tools.cjs state patch --"Scan Status" complete
226
+ ```
227
+ Capture `has_frontend` for testid-injector conditional.
228
+ Capture `detection_confidence` for checkpoint handling.
229
+
230
+ **Handle scanner checkpoint -- framework detection uncertain:**
231
+
232
+ If `detection_confidence` is `LOW`:
233
+ - If `IS_AUTO` is true: Auto-approve with most likely framework (SAFE checkpoint). Log: "Auto-approved: Scanner framework detection (LOW confidence, selected most likely framework)". Continue pipeline.
234
+ - If `IS_AUTO` is false: Present the detection details to user. Wait for confirmation. On user response, spawn fresh continuation agent with user's framework choice.
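As a rough sketch, the decision and checkpoint handling above reduces to one function over the scanner return fields. The function name and the action labels it prints are hypothetical; they are placeholders for the behavior described in this step.

```bash
# Hypothetical helper: map scanner return fields to the orchestrator's next action.
# $1 = decision (PROCEED|STOP)
# $2 = detection_confidence (HIGH|MEDIUM|LOW)
# $3 = IS_AUTO ("true"/"false")
scanner_next_action() {
  if [ "$1" = "STOP" ]; then
    echo "stop-pipeline"          # never proceed to a further stage
  elif [ "$2" = "LOW" ] && [ "$3" = "true" ]; then
    echo "auto-approve"           # SAFE checkpoint: pick most likely framework
  elif [ "$2" = "LOW" ]; then
    echo "ask-user"               # wait for the user's framework choice
  else
    echo "continue"
  fi
}

scanner_next_action PROCEED LOW false
```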
235
+ </step>
236
+
237
+ <step name="execute_codebase_map">
238
+ ## Step 4: Execute Codebase Map Stage
239
+
240
+ Deep-scan the codebase to produce QA-oriented documents that all downstream agents consume. This gives the pipeline full knowledge of the codebase structure, testability, risk areas, code patterns, and existing test coverage before any analysis or generation happens.
241
+
242
+ **State update -- mark codebase-map as running:**
243
+ ```bash
244
+ node bin/qaa-tools.cjs state patch --"Map Status" running --"Status" "Mapping codebase"
245
+ ```
246
+
247
+ **Print stage banner:**
248
+ ```
249
+ +------------------------------------------+
250
+ | STAGE 2: Codebase Map |
251
+ | Status: Running... |
252
+ +------------------------------------------+
253
+ ```
254
+
255
+ **Spawn 4 mapper agents in parallel (one per focus area):**
256
+
257
+ ```
258
+ For each focus in [testability, risk, patterns, existing-tests]:
259
+
260
+ Task(
261
+ prompt="
262
+ <objective>Analyze codebase for QA purposes. Focus area: {focus}. Write documents to {output_dir}/codebase/.</objective>
263
+ <execution_context>@agents/qaa-codebase-mapper.md</execution_context>
264
+ <files_to_read>
265
+ - CLAUDE.md
266
+ - {output_dir}/SCAN_MANIFEST.md
267
+ </files_to_read>
268
+ <parameters>
269
+ focus: {focus}
270
+ dev_repo_path: {dev_repo_path}
271
+ output_dir: {output_dir}/codebase
272
+ </parameters>
273
+ "
274
+ )
275
+ ```
276
+
277
+ All 4 agents can run simultaneously -- they read the codebase but write to separate files.
278
+
279
+ **Parse mapper returns and verify outputs exist:**
280
+
281
+ Expected files after all 4 complete:
282
+ ```
283
+ {output_dir}/codebase/TESTABILITY.md
284
+ {output_dir}/codebase/TEST_SURFACE.md
285
+ {output_dir}/codebase/RISK_MAP.md
286
+ {output_dir}/codebase/CRITICAL_PATHS.md
287
+ {output_dir}/codebase/CODE_PATTERNS.md
288
+ {output_dir}/codebase/API_CONTRACTS.md
289
+ {output_dir}/codebase/TEST_ASSESSMENT.md
290
+ {output_dir}/codebase/COVERAGE_GAPS.md
291
+ ```
292
+
293
+ Verify that at least 4 of the 8 files exist (each focus area is expected to produce 2 files). If any focus area produced 0 files, log a warning but continue -- the downstream agents treat these documents as optional reads.
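One way to perform this check is to count the expected documents that exist on disk. The sketch below assumes the file names listed above; `count_map_docs` is a hypothetical helper, and the demo runs against a throwaway directory rather than a real `{output_dir}/codebase`.

```bash
# Hypothetical helper: count how many expected mapper documents exist.
# $1 = codebase map directory; prints the number of files found (0-8).
count_map_docs() {
  found=0
  for name in TESTABILITY TEST_SURFACE RISK_MAP CRITICAL_PATHS \
              CODE_PATTERNS API_CONTRACTS TEST_ASSESSMENT COVERAGE_GAPS; do
    if [ -f "$1/$name.md" ]; then
      found=$((found + 1))
    fi
  done
  echo "$found"
}

# Demo: a temporary directory holding two of the eight documents.
demo_dir=$(mktemp -d)
touch "$demo_dir/RISK_MAP.md" "$demo_dir/CRITICAL_PATHS.md"
N=$(count_map_docs "$demo_dir")
if [ "$N" -lt 4 ]; then
  echo "WARNING: only $N of 8 codebase map documents found"
fi
```

The count `N` is also the value reported in the completion message for this stage.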
294
+
295
+ **State update -- mark codebase-map as complete:**
296
+ ```bash
297
+ node bin/qaa-tools.cjs state patch --"Map Status" complete --"Status" "Codebase mapped"
298
+ ```
299
+
300
+ Print: "Codebase map complete. {N} documents produced in {output_dir}/codebase/."
301
+
302
+ **Set codebase_map_dir for downstream stages:**
303
+ ```
304
+ codebase_map_dir = "{output_dir}/codebase"
305
+ ```
306
+ </step>
307
+
308
+ <step name="execute_analyze">
309
+ ## Step 5: Execute Analyze Stage
310
+
311
+ **State update -- mark analyze as running:**
312
+ ```bash
313
+ node bin/qaa-tools.cjs state patch --"Analyze Status" running --"Status" "Analyzing repository"
314
+ ```
315
+
316
+ **Print stage banner:**
317
+ ```
318
+ +------------------------------------------+
319
+ | STAGE 2: Analyzer |
320
+ | Status: Running... |
321
+ +------------------------------------------+
322
+ ```
323
+
324
+ **Determine analyzer mode based on option:**
325
+ - Option 1: `mode = 'full'` (produces QA_ANALYSIS.md + TEST_INVENTORY.md + QA_REPO_BLUEPRINT.md)
326
+ - Options 2 and 3: `mode = 'gap'` (produces GAP_ANALYSIS.md)
327
+
328
+ **Spawn analyzer agent via Task():**
329
+ ```
330
+ Task(
331
+ prompt="
332
+ <objective>Analyze scanned repository and produce analysis artifacts</objective>
333
+ <execution_context>@agents/qaa-analyzer.md</execution_context>
334
+ <files_to_read>
335
+ - {output_dir}/SCAN_MANIFEST.md
336
+ - CLAUDE.md
337
+ - {codebase_map_dir}/RISK_MAP.md (if exists)
338
+ - {codebase_map_dir}/CRITICAL_PATHS.md (if exists)
339
+ - {codebase_map_dir}/TEST_ASSESSMENT.md (if exists)
340
+ - {codebase_map_dir}/COVERAGE_GAPS.md (if exists)
341
+ </files_to_read>
342
+ <parameters>
343
+ mode: {mode}
344
+ workflow_option: {option}
345
+ dev_repo_path: {dev_repo_path}
346
+ qa_repo_path: {qa_repo_path or null}
347
+ output_path: {output_dir}/
348
+ codebase_map_dir: {codebase_map_dir}
349
+ </parameters>
350
+ "
351
+ )
352
+ ```
353
+
354
+ **Parse analyzer return:**
355
+
356
+ Expected return structure:
357
+ ```
358
+ ANALYZER_COMPLETE:
359
+ files_produced: [...]
360
+ total_test_count: N
361
+ pyramid_breakdown: {unit: N, integration: N, api: N, e2e: N}
362
+ risk_count: {high: N, medium: N, low: N}
363
+ commit_hash: "..."
364
+ ```
365
+
366
+ Capture `files_produced`, `total_test_count`, `pyramid_breakdown` for downstream stages.
367
+
368
+ **Handle analyzer checkpoint -- assumptions review:**
369
+
370
+ If the analyzer returns a checkpoint with assumptions:
371
+ - If `IS_AUTO` is true: Auto-approve all assumptions (SAFE checkpoint). Log: "Auto-approved: Analyzer assumptions". Continue pipeline.
372
+ - If `IS_AUTO` is false: Present assumptions to user for review. Wait for confirmation or corrections. On user response, spawn fresh continuation agent incorporating any corrections.
373
+
374
+ **State update -- mark analyze as complete:**
375
+ ```bash
376
+ node bin/qaa-tools.cjs state patch --"Analyze Status" complete
377
+ ```
378
+
379
+ Print completion message: "Analysis complete. {total_test_count} test cases identified. Pyramid: {pyramid_breakdown}."
380
+ </step>
381
+
382
+ <step name="execute_testid_inject">
383
+ ## Step 5: Execute TestID Injection Stage (Conditional)
384
+
385
+ **Condition:** Only execute if `has_frontend` is `true` from scanner return (Step 3).
386
+
387
+ **If `has_frontend` is false:**
388
+ Print: "Skipping TestID injection (no frontend detected)." and proceed directly to Step 6 (Plan).
389
+
390
+ **If `has_frontend` is true:**
391
+
392
+ **State update:**
393
+ ```bash
394
+ node bin/qaa-tools.cjs state patch --"Status" "Injecting test IDs into frontend components"
395
+ ```
396
+
397
+ **Print stage banner:**
398
+ ```
399
+ +------------------------------------------+
400
+ | STAGE 3: TestID Injector |
401
+ | Status: Running... |
402
+ +------------------------------------------+
403
+ ```
404
+
405
+ **Spawn testid-injector agent via Task():**
406
+ ```
407
+ Task(
408
+ prompt="
409
+ <objective>Audit and inject data-testid attributes into frontend components</objective>
410
+ <execution_context>@agents/qaa-testid-injector.md</execution_context>
411
+ <files_to_read>
412
+ - {output_dir}/SCAN_MANIFEST.md
413
+ - CLAUDE.md
414
+ </files_to_read>
415
+ <parameters>
416
+ dev_repo_path: {dev_repo_path}
417
+ output_path: {output_dir}/TESTID_AUDIT_REPORT.md
418
+ </parameters>
419
+ "
420
+ )
421
+ ```
422
+
423
+ **Parse return:**
424
+
425
+ Check for `INJECTOR_COMPLETE` vs `INJECTOR_SKIPPED`:
426
+
427
+ If `INJECTOR_COMPLETE`:
428
+ ```
429
+ INJECTOR_COMPLETE:
430
+ report_path: "..."
431
+ coverage_before: N%
432
+ coverage_after: N%
433
+ elements_injected: N
434
+ ...
435
+ ```
436
+ Log: "TestID injection complete. Coverage: {coverage_before}% -> {coverage_after}%. {elements_injected} elements injected."
437
+
438
+ If `INJECTOR_SKIPPED`:
439
+ ```
440
+ INJECTOR_SKIPPED:
441
+ reason: "..."
442
+ action: "..."
443
+ ```
444
+ Log the reason and continue pipeline.
445
+
446
+ **Handle injector checkpoint -- audit review:**
447
+ - If `IS_AUTO` is true: Auto-approve P0-only injection (SAFE checkpoint). Log: "Auto-approved: TestID injection (P0 elements only)". Continue pipeline.
448
+ - If `IS_AUTO` is false: Present audit report to user. Wait for approval, element selection, or rejection. On user response, spawn fresh continuation agent with user's approved elements.
449
+ </step>
450
+
451
+ <step name="execute_plan">
452
+ ## Step 6: Execute Plan Stage
453
+
454
+ **State update -- mark generation as running (planning is part of generate):**
455
+ ```bash
456
+ node bin/qaa-tools.cjs state patch --"Generate Status" running --"Status" "Planning test generation"
457
+ ```
458
+
459
+ **Print stage banner:**
460
+ ```
461
+ +------------------------------------------+
462
+ | STAGE 4: Planner |
463
+ | Status: Running... |
464
+ +------------------------------------------+
465
+ ```
466
+
467
+ **Determine planner input based on option:**
468
+ - Option 1: Input from `{output_dir}/TEST_INVENTORY.md` + `{output_dir}/QA_ANALYSIS.md`
469
+ - Options 2 and 3: Input from `{output_dir}/GAP_ANALYSIS.md`
470
+
471
+ **Spawn planner agent via Task():**
472
+ ```
473
+ Task(
474
+ prompt="
475
+ <objective>Create test generation plan with task breakdown and dependencies</objective>
476
+ <execution_context>@agents/qaa-planner.md</execution_context>
477
+ <files_to_read>
478
+ - {input files based on option -- see above}
479
+ - CLAUDE.md
480
+ - {codebase_map_dir}/TESTABILITY.md (if exists)
481
+ - {codebase_map_dir}/TEST_SURFACE.md (if exists)
482
+ - {codebase_map_dir}/CRITICAL_PATHS.md (if exists)
483
+ - {codebase_map_dir}/COVERAGE_GAPS.md (if exists)
484
+ </files_to_read>
485
+ <parameters>
486
+ workflow_option: {option}
487
+ output_path: {output_dir}/GENERATION_PLAN.md
488
+ codebase_map_dir: {codebase_map_dir}
489
+ </parameters>
490
+ "
491
+ )
492
+ ```
493
+
494
+ **Parse planner return:**
495
+
496
+ Expected return structure:
497
+ ```
498
+ PLANNER_COMPLETE:
499
+ file_path: "..."
500
+ total_tasks: N
501
+ total_files: N
502
+ feature_count: N
503
+ dependency_depth: N
504
+ test_case_count: N
505
+ commit_hash: "..."
506
+ ```
507
+
508
+ Capture `total_tasks`, `total_files`, `feature_count` for executor stage and pipeline summary.
509
+
510
+ Print: "Plan complete. {total_tasks} tasks, {total_files} files planned across {feature_count} features."
511
+ </step>
512
+
513
+ <step name="execute_generate">
514
+ ## Step 7: Execute Generate Stage (FLOW-04 -- Wave-based Parallel Execution)
515
+
516
+ State update continues from planning (already set to `running` in Step 6).
517
+
518
+ **Print stage banner:**
519
+ ```
520
+ +------------------------------------------+
521
+ | STAGE 5: Executor |
522
+ | Generating {total_files} test files |
523
+ | Status: Running... |
524
+ +------------------------------------------+
525
+ ```
526
+
527
+ **FLOW-04: Wave-based parallel execution:**
528
+
529
+ Check if planner created multiple independent feature groups. If `feature_count > 1` AND `parallelization` config allows parallel execution:
530
+
531
+ **Parallel execution (when feature_count > 1 and parallelization enabled):**
532
+
533
+ For each independent feature group from the generation plan, spawn a separate executor agent:
534
+ ```
535
+ Task(
536
+ prompt="
537
+ <objective>Generate test files for {feature} feature</objective>
538
+ <execution_context>@agents/qaa-executor.md</execution_context>
539
+ <files_to_read>
540
+ - {output_dir}/GENERATION_PLAN.md
541
+ - {output_dir}/TEST_INVENTORY.md (Option 1) or {output_dir}/GAP_ANALYSIS.md (Options 2/3)
542
+ - CLAUDE.md
543
+ - {codebase_map_dir}/CODE_PATTERNS.md (if exists)
544
+ - {codebase_map_dir}/API_CONTRACTS.md (if exists)
545
+ - {codebase_map_dir}/TEST_SURFACE.md (if exists)
546
+ </files_to_read>
547
+ <parameters>
548
+ workflow_option: {option}
549
+ feature_group: {feature}
550
+ dev_repo_path: {dev_repo_path}
551
+ qa_repo_path: {qa_repo_path or null}
552
+ output_path: {output_dir}/
553
+ codebase_map_dir: {codebase_map_dir}
554
+ </parameters>
555
+ "
556
+ )
557
+ ```
558
+
559
+ Multiple Task() calls can be issued simultaneously for independent feature groups. Each executor handles one feature group and commits its files independently.
560
+
561
+ **Sequential execution (when feature_count == 1 or parallelization disabled):**
562
+
563
+ Spawn a single executor agent covering all tasks:
564
+ ```
565
+ Task(
566
+ prompt="
567
+ <objective>Generate all test files from generation plan</objective>
568
+ <execution_context>@agents/qaa-executor.md</execution_context>
569
+ <files_to_read>
570
+ - {output_dir}/GENERATION_PLAN.md
571
+ - {output_dir}/TEST_INVENTORY.md (Option 1) or {output_dir}/GAP_ANALYSIS.md (Options 2/3)
572
+ - CLAUDE.md
573
+ - {codebase_map_dir}/CODE_PATTERNS.md (if exists)
574
+ - {codebase_map_dir}/API_CONTRACTS.md (if exists)
575
+ - {codebase_map_dir}/TEST_SURFACE.md (if exists)
576
+ </files_to_read>
577
+ <parameters>
578
+ workflow_option: {option}
579
+ dev_repo_path: {dev_repo_path}
580
+ qa_repo_path: {qa_repo_path or null}
581
+ output_path: {output_dir}/
582
+ codebase_map_dir: {codebase_map_dir}
583
+ </parameters>
584
+ "
585
+ )
586
+ ```
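The choice between the two spawning strategies above can be sketched as follows. The type of the `parallelization` config value is not specified here, so this example assumes it is the string `true` or `false`; `executor_mode` is a hypothetical name.

```bash
# Hypothetical helper: choose how executor agents are spawned.
# $1 = feature_count from the planner return
# $2 = parallelization config value (ASSUMED to be "true" or "false")
executor_mode() {
  if [ "$1" -gt 1 ] && [ "$2" = "true" ]; then
    echo "parallel"     # one executor Task() per independent feature group
  else
    echo "sequential"   # a single executor Task() covering all tasks
  fi
}

executor_mode 3 true
```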
587
+
588
+ **Option 3 specific -- skip existing tests:**
589
+
590
+ For Option 3, pass `skip_existing_test_ids: true` to the executor so it checks existing test files by test ID before generating. If a test ID already exists in the QA repo, skip generating that test case.
591
+
592
+ ```
593
+ <parameters>
594
+ workflow_option: 3
595
+ skip_existing_test_ids: true
596
+ dev_repo_path: {dev_repo_path}
597
+ qa_repo_path: {qa_repo_path}
598
+ output_path: {output_dir}/
599
+ </parameters>
600
+ ```
601
+
602
+ **Parse executor return:**
603
+
604
+ Expected return structure:
605
+ ```
606
+ EXECUTOR_COMPLETE:
607
+ files_created: [{path, type}, ...]
608
+ total_files: N
609
+ commit_count: N
610
+ features_covered: [...]
611
+ test_case_count: N
612
+ ```
613
+
614
+ Capture `files_created`, `total_files`, `commit_count` for validation stage and pipeline summary.
615
+
616
+ **State update -- mark generate as complete:**
617
+ ```bash
618
+ node bin/qaa-tools.cjs state patch --"Generate Status" complete --"Status" "Test generation complete"
619
+ ```
620
+
621
+ Print: "Generation complete. {total_files} files created across {features_covered.length} features. {commit_count} commits."
622
+ </step>
623
+
624
+ <step name="execute_validate">
625
+ ## Step 8: Execute Validate Stage
626
+
627
+ **State update -- mark validate as running:**
628
+ ```bash
629
+ node bin/qaa-tools.cjs state patch --"Validate Status" running --"Status" "Validating generated tests"
630
+ ```
631
+
632
+ **Print stage banner:**
633
+ ```
634
+ +------------------------------------------+
635
+ | STAGE 6: Validator |
636
+ | Validating {total_files} test files |
637
+ | Status: Running... |
638
+ +------------------------------------------+
639
+ ```
640
+
641
+ **Spawn validator agent via Task():**
642
+ ```
643
+ Task(
644
+ prompt="
645
+ <objective>Run 4-layer validation on all generated test files</objective>
646
+ <execution_context>@agents/qaa-validator.md</execution_context>
647
+ <files_to_read>
648
+ - {list all generated test files from executor return -- files_created paths}
649
+ - {output_dir}/GENERATION_PLAN.md
650
+ - CLAUDE.md
651
+ </files_to_read>
652
+ <parameters>
653
+ mode: validation
654
+ output_path: {output_dir}/VALIDATION_REPORT.md
655
+ </parameters>
656
+ "
657
+ )
658
+ ```
659
+
660
+ **Parse validator return:**
661
+
662
+ Expected return structure:
663
+ ```
664
+ VALIDATOR_COMPLETE:
665
+ report_path: "..."
666
+ overall_status: PASS | PASS_WITH_WARNINGS | FAIL
667
+ confidence: HIGH | MEDIUM | LOW
668
+ layers_summary: {syntax, structure, dependencies, logic}
669
+ fix_loops_used: N
670
+ issues_found: N
671
+ issues_fixed: N
672
+ unresolved_count: N
673
+ ```
674
+
675
+ **RISKY CHECKPOINT -- Validator escalation (FLOW-07):**
676
+
677
+ If `unresolved_count > 0` after max fix loops (3):
678
+ - **ALWAYS pause, even in auto mode** (this is a locked decision from CONTEXT.md)
679
+ - Present unresolved issues to user with full details from VALIDATION_REPORT.md
680
+ - Wait for user decision:
681
+ - `"approve-with-warnings"`: Accept the validation with warnings. Set Validate Status to complete. Continue to deliver.
682
+ - `"abort"`: Set Validate Status to failed. STOP PIPELINE ENTIRELY.
683
+ - Manual guidance: User provides specific fix instructions. Spawn fresh continuation agent to apply fixes and re-validate.
684
+
685
+ If `overall_status` is `PASS` or `PASS_WITH_WARNINGS` (and unresolved_count is 0):
686
+ ```bash
687
+ node bin/qaa-tools.cjs state patch --"Validate Status" complete --"Status" "Validation passed"
688
+ ```
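The checkpoint rules above can be summarized in a short sketch. Note that `IS_AUTO` is deliberately not an input: unresolved issues pause the pipeline regardless of auto mode. The function name and the printed labels are hypothetical, and the final branch (a `FAIL` status with no unresolved issues) is an assumption about how the orchestrator hands off to failure classification.

```bash
# Hypothetical helper: decide what happens after the validator returns.
# $1 = overall_status (PASS|PASS_WITH_WARNINGS|FAIL)
# $2 = unresolved_count after the maximum number of fix loops
validator_next_action() {
  if [ "$2" -gt 0 ]; then
    echo "pause-for-user"        # RISKY checkpoint: always pause, even in auto mode
  elif [ "$1" = "PASS" ] || [ "$1" = "PASS_WITH_WARNINGS" ]; then
    echo "mark-complete"         # set Validate Status to complete
  else
    echo "needs-classification"  # assumption: FAIL is left for a later stage
  fi
}

validator_next_action PASS_WITH_WARNINGS 0
```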
689
+
690
+ Print: "Validation complete. Status: {overall_status}. Confidence: {confidence}. {issues_found} issues found, {issues_fixed} fixed, {unresolved_count} unresolved."
691
+ </step>
692
+
693
+ <step name="execute_e2e_runner">
694
+ ## Step 9: Execute E2E Runner Stage (Conditional)
695
+
696
+ **Condition:** Only execute if E2E test files were generated. Check whether `files_created` from the executor return contains any `*.e2e.spec.*` or `*.e2e.cy.*` files. This stage also requires a live application URL.
697
+
698
+ If no E2E files were generated:
699
+ Print: "Skipping E2E Runner (no E2E test files generated)." and proceed to Step 10.
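A minimal sketch of the file-name check, assuming the paths from `files_created` are passed as arguments. `has_e2e_files` is a hypothetical helper and the example paths are made up for illustration.

```bash
# Hypothetical helper: succeed if any path matches *.e2e.spec.* or *.e2e.cy.*
# Arguments: file paths taken from the executor's files_created list.
has_e2e_files() {
  for path in "$@"; do
    case "$path" in
      *.e2e.spec.*|*.e2e.cy.*) return 0 ;;
    esac
  done
  return 1
}

# Example with illustrative paths (not taken from a real run).
if has_e2e_files "tests/unit/cart.unit.spec.ts" "tests/e2e/login.e2e.spec.ts"; then
  echo "E2E files present: run E2E Runner"
else
  echo "Skipping E2E Runner (no E2E test files generated)."
fi
```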
700
+
701
+ **State update:**
702
+ ```bash
703
+ node bin/qaa-tools.cjs state patch --"Status" "Running E2E tests against live app"
704
+ ```
705
+
706
+ **Print stage banner:**
707
+ ```
708
+ +------------------------------------------+
709
+ | STAGE 7: E2E Runner |
710
+ | Running tests against live application |
711
+ | Status: Running... |
712
+ +------------------------------------------+
713
+ ```
714
+
715
+ **Spawn e2e-runner agent via Task():**
716
+ ```
717
+ Task(
718
+ prompt="
719
+ <objective>Run E2E tests against live application, capture real locators, fix mismatches, loop until pass</objective>
720
+ <execution_context>@agents/qaa-e2e-runner.md</execution_context>
721
+ <files_to_read>
722
+ - CLAUDE.md
723
+ - {e2e_test_files from executor return}
724
+ - {pom_files from executor return}
725
+ </files_to_read>
726
+ <parameters>
727
+ app_url: {app_url if provided, otherwise auto-detect}
728
+ output_dir: {output_dir}
729
+ dev_repo_path: {dev_repo_path}
730
+ </parameters>
731
+ "
732
+ )
733
+ ```
734
+
735
+ **Parse e2e-runner return:**
736
+
737
+ Expected return structure:
738
+ ```
739
+ E2E_RUNNER_COMPLETE:
740
+ app_url: "..."
741
+ total_tests: N
742
+ passed: N
743
+ failed: N
744
+ locator_fixes: N
745
+ app_bugs_found: N
746
+ fix_loops_used: N
747
+ report_path: "..."
748
+ screenshots: [...]
749
+ ```
750
+
751
+ Capture `passed`, `failed`, `app_bugs_found`, `locator_fixes` for pipeline summary.
752
+
753
+ If `app_bugs_found > 0`:
754
+ Print: "E2E Runner found {app_bugs_found} application bug(s). See E2E_RUN_REPORT.md for details with screenshots."
755
+
756
+ If `failed > 0` and `app_bugs_found == 0`:
757
+ Print: "E2E Runner: {failed} test(s) still failing after {fix_loops_used} fix loops. Proceeding to Bug Detective for classification."
758
+
759
+ Print: "E2E run complete. {passed}/{total_tests} passed. {locator_fixes} locators fixed. {app_bugs_found} app bugs found."
760
+ </step>
761
+
762
+ <step name="execute_bug_detective">
763
+ ## Step 10: Execute Bug Detective Stage (Conditional)
764
+
765
+ **Condition:** Only execute if the validator reports test failures. Check:
766
+ - `overall_status === 'FAIL'` in validator return, OR
767
+ - Generated tests have runtime failures that need classification
768
+
769
+ If the validator reports `PASS` or `PASS_WITH_WARNINGS` and there are no test execution failures, skip this stage entirely.
770
+
771
+ **If no failures to classify:**
772
+ Print: "Skipping Bug Detective (no test failures detected)." and proceed directly to the Deliver stage.
773
+
774
+ **If failures need classification:**
775
+
776
+ **State update:**
777
+ ```bash
778
+ node bin/qaa-tools.cjs state patch --"Status" "Classifying test failures"
779
+ ```
780
+
781
+ **Print stage banner:**
782
+ ```
783
+ +------------------------------------------+
784
+ | STAGE 7: Bug Detective |
785
+ | Status: Running... |
786
+ +------------------------------------------+
787
+ ```
788
+
789
+ **Spawn bug-detective agent via Task():**
790
+ ```
791
+ Task(
792
+ prompt="
793
+ <objective>Classify test failures and attempt auto-fixes for test errors</objective>
794
+ <execution_context>@agents/qaa-bug-detective.md</execution_context>
795
+ <files_to_read>
796
+ - {test execution results -- from validator or direct test run}
797
+ - {failing test source files -- paths from executor return}
798
+ - CLAUDE.md
799
+ </files_to_read>
800
+ <parameters>
801
+ output_path: {output_dir}/FAILURE_CLASSIFICATION_REPORT.md
802
+ </parameters>
803
+ "
804
+ )
805
+ ```
806
+
807
+ **Parse bug-detective return:**
808
+
809
+ Expected return structure:
810
+ ```
811
+ DETECTIVE_COMPLETE:
812
+ report_path: "..."
813
+ total_failures: N
814
+ classification_breakdown: {app_bug: N, test_error: N, env_issue: N, inconclusive: N}
815
+ auto_fixes_applied: N
816
+ auto_fixes_verified: N
817
+ commit_hash: "..."
818
+ ```
819
+
820
+ **RISKY CHECKPOINT -- Application bugs detected:**
821
+
822
+ If `classification_breakdown.app_bug > 0`:
823
+ - **ALWAYS pause, even in auto mode** (locked decision -- application bugs require developer action)
824
+ - Present APPLICATION BUG classifications to user with full evidence from FAILURE_CLASSIFICATION_REPORT.md
825
+ - These are genuine bugs in the application code discovered during test execution
826
+ - The bug detective never touches application code -- it only reports
827
+ - User must review and decide how to proceed:
828
+ - Acknowledge bugs and continue pipeline (bugs will be in the PR description for developer attention)
829
+ - Abort pipeline to fix bugs first
830
+
831
+ Print: "Bug Detective complete. {total_failures} failures classified: {app_bug} APP BUG, {test_error} TEST ERROR, {env_issue} ENV ISSUE, {inconclusive} INCONCLUSIVE. {auto_fixes_applied} auto-fixes applied."
832
+ </step>
833
+
834
+ <step name="execute_deliver">
835
+ ## Step 11: Execute Deliver Stage
836
+
837
+ **State update -- mark deliver as running:**
838
+ ```bash
839
+ node bin/qaa-tools.cjs state patch --"Deliver Status" running --"Status" "Preparing delivery"
840
+ ```
841
+
842
+ **Print stage banner:**
843
+ ```
844
+ +------------------------------------------+
845
+ | STAGE 8: Deliver |
846
+ | Status: Running... |
847
+ +------------------------------------------+
848
+ ```
849
+
850
+ ### Sub-step 1: Pre-flight checks
851
+
852
+ Before attempting branch creation or PR, verify that the required tools are available and authenticated.
853
+
854
+ **Check for git remote:**
855
+ ```bash
856
+ REMOTE_URL=$(git remote get-url origin 2>/dev/null || echo "")
857
+ ```
858
+
859
+ If `REMOTE_URL` is empty:
860
+ - Print error: `"No git remote found. Artifacts committed locally but PR creation skipped."`
861
+ - Set `LOCAL_ONLY=true`
862
+ - Skip Sub-steps 6, 7, 8, 9 (push and PR creation). Still execute Sub-steps 2-5 (branch creation and commits) so artifacts are organized on a local branch.
863
+
864
+ **Check for gh CLI authentication:**
865
+ ```bash
866
+ gh auth status 2>/dev/null
867
+ ```
868
+
869
+ If `gh auth status` fails (non-zero exit code):
870
+ - Print error: `"gh CLI not authenticated. Run 'gh auth login' first. Artifacts committed locally."`
871
+ - Set `LOCAL_ONLY=true`
872
+ - Same skip behavior as above.
873
+
874
+ If both checks pass, set `LOCAL_ONLY=false`.
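The two pre-flight checks can be folded into one decision. In this sketch the remote URL and the `gh auth status` exit code are passed in as arguments so the logic can be exercised without a real repository; `resolve_local_only` is a hypothetical name.

```bash
# Hypothetical helper: derive LOCAL_ONLY from the two pre-flight results.
# $1 = remote URL ("" when no origin remote is configured)
# $2 = exit code of `gh auth status`
resolve_local_only() {
  if [ -z "$1" ]; then
    echo "No git remote found. Artifacts committed locally but PR creation skipped." >&2
    echo "true"
  elif [ "$2" -ne 0 ]; then
    echo "gh CLI not authenticated. Run 'gh auth login' first. Artifacts committed locally." >&2
    echo "true"
  else
    echo "false"
  fi
}

# Example: a remote exists and gh is authenticated (illustrative URL).
LOCAL_ONLY=$(resolve_local_only "git@example.com:acme/shop.git" 0)
echo "LOCAL_ONLY=$LOCAL_ONLY"
```

Messages go to stderr so that the captured stdout contains only the `true`/`false` result.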
875
+
876
+ ### Sub-step 2: Derive project name
877
+
878
+ Read `dev_repo_path` from init context (available from Step 1).
879
+
880
+ **Attempt to read from package.json:**
881
+ ```bash
882
+ PROJECT_NAME=$(node -e "try { const p = require(require('path').resolve(process.argv[1], 'package.json')); console.log(p.name || ''); } catch { console.log(''); }" "${dev_repo_path}")
883
+ ```
884
+
885
+ **Fallback to directory basename:**
886
+ ```bash
887
+ if [ -z "$PROJECT_NAME" ]; then
888
+ PROJECT_NAME=$(basename "${dev_repo_path}")
889
+ fi
890
+ ```
891
+
892
+ **Sanitize for branch naming (lowercase, alphanumeric and hyphens only):**
893
+ ```bash
894
+ PROJECT_NAME=$(echo "$PROJECT_NAME" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]/-/g' | sed 's/--*/-/g' | sed 's/^-//' | sed 's/-$//')
895
+ ```
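To see what the sanitization pipeline does to typical inputs, it can be wrapped in a function and run on a few sample names. The wrapper name `sanitize_project_name` and the sample inputs are illustrative only; the `tr` and `sed` stages are the same as above.

```bash
# Hypothetical wrapper around the sanitization pipeline shown above.
# $1 = raw project name; prints a lowercase, hyphen-separated name.
sanitize_project_name() {
  printf '%s\n' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed 's/[^a-z0-9]/-/g' \
    | sed 's/--*/-/g' \
    | sed 's/^-//' \
    | sed 's/-$//'
}

sanitize_project_name "@Acme/My_App"   # scoped package name
sanitize_project_name "My  Project!"   # spaces and punctuation
```

A scoped npm name such as `@Acme/My_App` loses its leading `@` and becomes `acme-my-app`, which is safe to embed in a branch name.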
896
+
897
+ ### Sub-step 3: Detect default branch
898
+
899
+ ```bash
900
+ DEFAULT_BRANCH=$(gh repo view --json defaultBranchRef --jq '.defaultBranchRef.name' 2>/dev/null || echo "main")
901
+ ```
902
+
903
+ If `gh repo view` fails (e.g., no remote or gh not authenticated), fall back to `"main"`.
904
+
905
+ ### Sub-step 4: Create feature branch
906
+
907
+ **Branch name:** `qa/auto-{PROJECT_NAME}-{date}` where `date` comes from init context.
908
+
909
+ **Handle collision:** If branch already exists locally or remotely, append a numeric suffix:
910
+ ```bash
911
+ BRANCH="qa/auto-${PROJECT_NAME}-${date}"
912
+
913
+ # Check if branch exists locally or remotely
914
+ if git rev-parse --verify "$BRANCH" >/dev/null 2>&1 || git rev-parse --verify "origin/$BRANCH" >/dev/null 2>&1; then
915
+ SUFFIX=2
916
+ while git rev-parse --verify "${BRANCH}-${SUFFIX}" >/dev/null 2>&1 || git rev-parse --verify "origin/${BRANCH}-${SUFFIX}" >/dev/null 2>&1; do
917
+ SUFFIX=$((SUFFIX + 1))
918
+ done
919
+ BRANCH="${BRANCH}-${SUFFIX}"
920
+ fi
921
+ ```
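The suffix logic can be tried without a git repository by replacing the `git rev-parse` lookups with a plain list of taken names. This is a sketch under that assumption: `next_free_branch` is a hypothetical helper and the branch names are examples.

```bash
# Hypothetical helper: pick a free branch name.
# $1 = desired branch name
# $2 = existing branch names, one per line (stands in for git rev-parse checks)
next_free_branch() {
  candidate="$1"
  suffix=2
  # Keep appending -2, -3, ... until the candidate is not in the list
  while printf '%s\n' "$2" | grep -Fxq -- "$candidate"; do
    candidate="$1-$suffix"
    suffix=$((suffix + 1))
  done
  echo "$candidate"
}

existing="qa/auto-shop-2025-01-15
qa/auto-shop-2025-01-15-2"
next_free_branch "qa/auto-shop-2025-01-15" "$existing"
```

As in the block above, the first suffix tried is `2`, so a second run on the same date yields `...-2` and a third yields `...-3`.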
922
+
923
+ **Create from default branch:**
924
+ ```bash
925
+ git checkout -b "$BRANCH" "$DEFAULT_BRANCH"
926
+ ```
927
+
928
+ If branch creation fails, print error and fall through to local-only commit on current branch.
929
+
930
+ ### Sub-step 5: Per-stage atomic commits
931
+
932
+ For each pipeline stage that produced artifacts, commit using `qaa-tools.cjs commit`. Check file existence before each commit -- skip stages that did not produce artifacts.
933
+
934
+ **Scanner:**
935
+ ```bash
936
+ if [ -f "${output_dir}/SCAN_MANIFEST.md" ]; then
937
+ node bin/qaa-tools.cjs commit "qa(scanner): produce SCAN_MANIFEST.md for ${PROJECT_NAME}" --files ${output_dir}/SCAN_MANIFEST.md
938
+ fi
939
+ ```
940
+
941
+ **Analyzer (Option 1):**
942
+ ```bash
943
+ if [ -f "${output_dir}/QA_ANALYSIS.md" ]; then
944
+ ANALYZER_FILES="${output_dir}/QA_ANALYSIS.md ${output_dir}/TEST_INVENTORY.md"
945
+ if [ -f "${output_dir}/QA_REPO_BLUEPRINT.md" ]; then
946
+ ANALYZER_FILES="${ANALYZER_FILES} ${output_dir}/QA_REPO_BLUEPRINT.md"
947
+ fi
948
+ node bin/qaa-tools.cjs commit "qa(analyzer): produce QA_ANALYSIS.md and TEST_INVENTORY.md" --files ${ANALYZER_FILES}
949
+ fi
950
+ ```
951
+
952
+ **Analyzer (Option 2/3):**
953
+ ```bash
954
+ if [ -f "${output_dir}/GAP_ANALYSIS.md" ]; then
955
+ node bin/qaa-tools.cjs commit "qa(analyzer): produce GAP_ANALYSIS.md" --files "${output_dir}/GAP_ANALYSIS.md"
956
+ fi
957
+ ```
958
+
959
+ **TestID Injector (if ran):**
960
+ ```bash
961
+ if [ -f "${output_dir}/TESTID_AUDIT_REPORT.md" ]; then
962
+ node bin/qaa-tools.cjs commit "qa(testid-injector): inject ${injected_count} data-testid attributes across ${component_count} components" --files ${output_dir}/TESTID_AUDIT_REPORT.md ${modified_source_files}
963
+ fi
964
+ ```
965
+
966
+ Where `injected_count` and `component_count` are captured from the testid-injector return in Step 5, and `modified_source_files` are the frontend source files that were modified.
967
+
968
+ **Executor:**
969
+ ```bash
970
+ if [ -n "${generated_file_paths}" ]; then
971
+ node bin/qaa-tools.cjs commit "qa(executor): generate ${total_files} test files with POMs and fixtures" --files ${generated_file_paths}
972
+ fi
973
+ ```
974
+
975
+ Where `total_files` and `generated_file_paths` are captured from the executor return in Step 7.
976
+
977
+ **Validator:**
978
+ ```bash
979
+ if [ -f "${output_dir}/VALIDATION_REPORT.md" ]; then
980
+ node bin/qaa-tools.cjs commit "qa(validator): validate generated tests - ${overall_status} with ${confidence} confidence" --files "${output_dir}/VALIDATION_REPORT.md"
981
+ fi
982
+ ```
983
+
984
+ Where `overall_status` and `confidence` are captured from the validator return in Step 8.
985
+
986
+ **Bug Detective (if ran):**
987
+ ```bash
988
+ if [ -f "${output_dir}/FAILURE_CLASSIFICATION_REPORT.md" ]; then
989
+ node bin/qaa-tools.cjs commit "qa(bug-detective): classify ${total_failures} failures - ${classification_summary}" --files "${output_dir}/FAILURE_CLASSIFICATION_REPORT.md"
990
+ fi
991
+ ```
992
+
993
+ Where `total_failures` and `classification_summary` (e.g., "2 APP BUG, 2 TEST ERROR, 1 ENV ISSUE") are captured from the bug-detective return in Step 9.
994
+
995
+ ### Sub-step 6: Push branch
996
+
997
+ If `LOCAL_ONLY` is true, skip this sub-step.
998
+
999
+ ```bash
1000
+ git push -u origin "$BRANCH"
1001
+ ```
1002
+
1003
+ If push fails:
1004
+ - Print error: `"Push failed: {error_message}. Artifacts committed locally on branch ${BRANCH}."`
1005
+ - Set `LOCAL_ONLY=true` (skip PR creation but keep local commits).
1006
+ - Do NOT set deliver status to failed -- artifacts are committed locally.
1007
+
1008
+ ### Sub-step 7: Build PR body
1009
+
1010
+ If `LOCAL_ONLY` is true, skip this sub-step.
1011
+
1012
+ **Read the PR template:**
1013
+ ```bash
1014
+ PR_BODY=$(cat templates/pr-template.md)
1015
+ ```
1016
+
1017
+ **Replace all `{placeholder}` tokens with actual values collected during pipeline execution:**
1018
+
1019
+ - `{architecture_type}` -- from QA_ANALYSIS.md architecture overview section (or init context if available). Example: "Next.js full-stack application"
1020
+ - `{framework}` -- from QA_ANALYSIS.md or SCAN_MANIFEST.md framework detection. Example: "Playwright"
1021
+ - `{risk_summary}` -- from QA_ANALYSIS.md risk assessment counts. Example: "3 HIGH, 5 MEDIUM, 2 LOW"
1022
+ - `{unit_count}` -- from `pyramid_breakdown.unit` captured in Step 4 (analyzer return)
1023
+ - `{integration_count}` -- from `pyramid_breakdown.integration`
1024
+ - `{api_count}` -- from `pyramid_breakdown.api`
1025
+ - `{e2e_count}` -- from `pyramid_breakdown.e2e`
1026
+ - `{total_count}` -- from `total_test_count` captured in Step 4
1027
+ - `{modules_covered}` -- count of modules/files with at least one test case in TEST_INVENTORY.md
1028
+ - `{coverage_estimate}` -- estimated coverage percentage from QA_ANALYSIS.md recommended testing pyramid section
1029
+ - `{validation_result}` -- from `overall_status` captured in Step 8 (validator return). PASS, PASS_WITH_WARNINGS, or FAIL
1030
+ - `{confidence}` -- from `confidence` captured in Step 8. HIGH, MEDIUM, or LOW
1031
+ - `{fix_loops_used}` -- from `fix_loops_used` captured in Step 8. Number 0-3
1032
+ - `{issues_found}` -- from `issues_found` captured in Step 8
1033
+ - `{issues_fixed}` -- from `issues_fixed` captured in Step 8
1034
+ - `{file_list}` -- if total generated files <= 50, list each file as `- {path}`. If > 50, use summary: `{N} files across {M} directories`
1035
+
1036
+ All replacements use simple string substitution. The PR body must be in English.
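The substitution rules above could be sketched as follows. This is a minimal illustration under stated assumptions, not the orchestrator's actual implementation: `fill_placeholder` and `render_file_list` are hypothetical helper names, the file paths are assumed to arrive one per line, and `awk` is used only to keep the replacement literal so that characters such as `&` in a value are not interpreted.

```bash
# Hypothetical helper (not part of qaa-tools): replace every literal
# {name} token in a template with a value.
#   $1 = template text, $2 = placeholder name, $3 = replacement value
fill_placeholder() {
  printf '%s\n' "$1" | TOKEN="{$2}" VALUE="$3" awk '
    BEGIN { tok = ENVIRON["TOKEN"]; val = ENVIRON["VALUE"]; n = length(tok) }
    {
      out = ""
      # index() is a plain substring search, so no regex escaping is needed.
      while ((i = index($0, tok)) > 0) {
        out = out substr($0, 1, i - 1) val
        $0 = substr($0, i + n)
      }
      print out $0
    }'
}

# Hypothetical helper: build the {file_list} value from newline-separated
# paths (an assumption), applying the 50-file threshold described above.
render_file_list() (
  paths=$(printf '%s\n' "$1" | grep . || true)
  count=$(printf '%s\n' "$paths" | grep -c . || true)
  [ "$count" -eq 0 ] && exit 0
  if [ "$count" -le 50 ]; then
    # Small result: one "- path" bullet per file.
    printf '%s\n' "$paths" | sed 's/^/- /'
  else
    # Large result: summarize as file and directory counts.
    dirs=$(printf '%s\n' "$paths" | while IFS= read -r p; do dirname "$p"; done | sort -u | wc -l | tr -d ' ')
    printf '%s files across %s directories\n' "$count" "$dirs"
  fi
)

# Example usage (values are illustrative):
#   PR_BODY=$(fill_placeholder "$PR_BODY" framework "Playwright")
#   PR_BODY=$(fill_placeholder "$PR_BODY" file_list "$(render_file_list "$paths")")
```

Calling `fill_placeholder` once per token keeps each replacement independent, so a token that is missing from the template is simply left untouched.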
1037
+
1038
+ ### Sub-step 8: Create draft PR
1039
+
1040
+ If `LOCAL_ONLY` is true, skip this sub-step.
1041
+
1042
+ ```bash
1043
+ PR_URL=$(gh pr create \
1044
+ --draft \
1045
+ --title "qa: automated test suite for ${PROJECT_NAME}" \
1046
+ --body "${PR_BODY}" \
1047
+ --label "qa-automation" \
1048
+ --label "auto-generated" \
1049
+ --assignee "@me" 2>&1)
1050
+ ```
1051
+
1052
+ **Do NOT pass the `--base` flag.** Let gh auto-detect the default branch to avoid errors when the default branch is not named "main".
1053
+
1054
+ **On success:** `gh pr create` prints the PR URL as the last line of its output. Because the command above merges stderr into the capture (`2>&1`), keep only that last line as `PR_URL`.
1055
+
1056
+ **On failure:**
1057
+ - Print error: `"PR creation failed: ${PR_URL}. Artifacts remain on branch ${BRANCH}."`
1058
+ - Set deliver status to failed:
1059
+ ```bash
1060
+ node bin/qaa-tools.cjs state patch --"Deliver Status" failed --"Status" "Deliver failed: PR creation error"
1061
+ ```
1062
+ - Do NOT stop the pipeline -- artifacts are committed and pushed. The QA engineer can create the PR manually.
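The success and failure branches above could be handled along these lines. This is a sketch only: `handle_pr_result` is a hypothetical helper, and it assumes the caller has captured the combined output and the exit status of `gh pr create` separately.

```bash
# Hypothetical helper: turn the captured output and exit status of
# `gh pr create` into either the PR URL (stdout) or an error (stderr).
#   $1 = captured output, $2 = exit status, $3 = branch name
handle_pr_result() (
  pr_output=$1
  pr_status=$2
  branch=$3
  if [ "$pr_status" -eq 0 ]; then
    # stderr was merged into the capture, so keep only the last line.
    printf '%s\n' "$pr_output" | tail -n 1
  else
    echo "PR creation failed: ${pr_output}. Artifacts remain on branch ${branch}." >&2
    exit 1
  fi
)

# Example usage (PR_OUTPUT holds the capture from the gh command above):
#   PR_STATUS=$?
#   PR_URL=$(handle_pr_result "$PR_OUTPUT" "$PR_STATUS" "$BRANCH") || echo "deliver failed"
```

Recording the exit status immediately after the capture matters, because any later command would overwrite `$?`.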
1063
+
1064
+ ### Sub-step 9: Print PR URL
1065
+
1066
+ If PR was created successfully:
1067
+ ```
1068
+ PR created: ${PR_URL}
1069
+ ```
1070
+
1071
+ If `LOCAL_ONLY` is true:
1072
+ ```
1073
+ PR: not created (local-only mode). Artifacts committed on branch: ${BRANCH}
1074
+ ```
1075
+
1076
+ Store `PR_URL` (or the local-only message) for inclusion in the pipeline summary banner.
1077
+
1078
+ **State update -- mark deliver as complete (on success):**
1079
+ ```bash
1080
+ node bin/qaa-tools.cjs state patch --"Deliver Status" complete --"Status" "Pipeline complete"
1081
+ ```
1082
+
1083
+ **Clear auto-chain flag at pipeline completion:**
1084
+ ```bash
1085
+ node bin/qaa-tools.cjs config-set workflow._auto_chain_active false
1086
+ ```
1087
+ </step>
1088
+
1089
+ </process>
1090
+
1091
+ <auto_advance>
1092
+ ## Auto-Advance Mode
1093
+
1094
+ Auto-advance is enabled when ANY of these is true:
1095
+ - config.json `workflow.auto_advance = true` (persistent user preference)
1096
+ - `--auto` flag passed to orchestrator invocation (per-run override)
1097
+ - `workflow._auto_chain_active = true` in config (ephemeral chain flag from ongoing auto run)
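As a rough sketch, the "ANY of these" rule reduces to a logical OR over three inputs. `is_auto_mode` is a hypothetical helper, and how each value is read from config.json or from the command line is left out here because the source does not specify it.

```bash
# Hypothetical helper: each argument is the string "true" or "false".
#   $1 = workflow.auto_advance from config.json
#   $2 = whether --auto was passed for this run
#   $3 = workflow._auto_chain_active from config
is_auto_mode() {
  [ "$1" = "true" ] || [ "$2" = "true" ] || [ "$3" = "true" ]
}

# Example usage:
#   if is_auto_mode "$cfg_auto_advance" "$auto_flag" "$chain_active"; then IS_AUTO=true; fi
```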
1098
+
1099
+ ### Behavior in Auto Mode
1100
+
1101
+ **Safe checkpoints are auto-approved.** The pipeline continues without pausing. A log message records the auto-approval:
1102
+ ```
1103
+ Auto-approved: {checkpoint_description}
1104
+ ```
1105
+
1106
+ **Risky checkpoints ALWAYS pause.** Even in auto mode, the pipeline stops and presents the checkpoint to the user. This is a locked decision -- unresolved validation issues and application bugs require human judgment.
1107
+
1108
+ **Full progress banners are shown** even in auto mode -- the user sees the pipeline flowing in the terminal with stage banners, agent spawning indicators, and completion messages. Auto mode does not suppress output.
1109
+
1110
+ **On stage failure in auto mode:** STOP PIPELINE ENTIRELY. Report which stage failed and why. No partial PR. User must intervene.
1111
+
1112
+ ### Safe vs Risky Checkpoint Classification
1113
+
1114
+ See the `<checkpoint_system>` section below for the complete classification table and handling flow.
1115
+
1116
+ ### Stale Chain Flag Protection
1117
+
1118
+ At orchestrator init, if `--auto` was NOT passed:
1119
+ ```bash
1120
+ node bin/qaa-tools.cjs config-set workflow._auto_chain_active false
1121
+ ```
1122
+ This prevents a previous interrupted `--auto` run from causing unexpected auto-advance in a new manual session.
1123
+
1124
+ ### Auto Mode Persistence
1125
+
1126
+ When `--auto` is passed:
1127
+ ```bash
1128
+ node bin/qaa-tools.cjs config-set workflow._auto_chain_active true
1129
+ ```
1130
+ This flag persists across agent spawns within the same pipeline run. Each spawned agent can check it to maintain auto-advance behavior through the chain.
1131
+
1132
+ At pipeline completion (success or failure), clear the chain flag:
1133
+ ```bash
1134
+ node bin/qaa-tools.cjs config-set workflow._auto_chain_active false
1135
+ ```
1136
+ </auto_advance>
1137
+
1138
+ <checkpoint_system>
1139
+ ## Checkpoint System
1140
+
1141
+ ### Checkpoint Classification
1142
+
1143
+ Every agent may return checkpoint data when it encounters a situation requiring human input. The orchestrator classifies each checkpoint as SAFE or RISKY and handles it accordingly.
1144
+
1145
+ **SAFE checkpoints (auto-approve in auto mode):**
1146
+
1147
+ | Checkpoint | Agent | Why Safe | Auto-Action |
1148
+ |------------|-------|----------|-------------|
1149
+ | Framework detection uncertain (LOW confidence) | Scanner | Auto-select most likely framework; analysis can continue with reasonable default | Approve with most likely framework |
1150
+ | Analyzer assumptions review | Analyzer | Assumptions are informational; incorrect assumptions produce suboptimal but not broken output | Approve all assumptions |
1151
+ | TestID audit review | TestID Injector | P0-only injection is conservative; only forms, buttons, and primary actions receive test IDs | Approve P0-only injection |
1152
+
1153
+ **RISKY checkpoints (ALWAYS pause, even in auto mode):**
1154
+
1155
+ | Checkpoint | Agent | Why Risky | User Action Required |
1156
+ |------------|-------|-----------|---------------------|
1157
+ | Validator escalation (unresolved issues after 3 fix loops) | Validator | Unresolved issues mean tests may be broken; delivering broken tests defeats the purpose | User decides: approve-with-warnings, abort, or provide fix guidance |
1158
+ | APPLICATION BUG classification | Bug Detective | Genuine bugs in application code require developer action, not auto-fix | User reviews bug evidence and decides whether to continue or fix first |
1159
+ | Any checkpoint with `blocking` containing "unresolved" or "failed" | Any agent | Indicates pipeline integrity risk; proceeding could produce incorrect artifacts | User reviews the specific blocking issue |
1160
+
1161
+ ### Checkpoint Handling Flow
1162
+
1163
+ ```
1164
+ On agent return with checkpoint data:
1165
+ 1. Extract checkpoint `blocking` field content
1166
+ 2. Classify as SAFE or RISKY:
1167
+ - Match against safe patterns:
1168
+ "framework detection" -> SAFE
1169
+ "assumptions" -> SAFE
1170
+ "audit" or "data-testid" -> SAFE
1171
+ - Match against risky patterns:
1172
+ "unresolved" -> RISKY
1173
+ "failed" -> RISKY
1174
+ "APPLICATION BUG" -> RISKY
1175
+ - Default (no pattern match) -> RISKY (conservative)
1176
+ 3. If IS_AUTO and SAFE:
1177
+ - Auto-approve with default action
1178
+ - Log: "Auto-approved: {checkpoint_description}"
1179
+ - Continue pipeline to next stage
1180
+ 4. If IS_AUTO and RISKY:
1181
+ - PAUSE pipeline
1182
+ - Print checkpoint details with full context:
1183
+ - What stage triggered the checkpoint
1184
+ - What was completed so far
1185
+ - The specific blocking issue
1186
+ - What artifacts have been produced
1187
+ - Wait for user input
1188
+ - On user response: spawn fresh continuation agent
1189
+ 5. If NOT auto (manual mode):
1190
+ - PAUSE pipeline
1191
+ - Print checkpoint details with full context
1192
+ - Wait for user input
1193
+ - On user response: spawn fresh continuation agent
1194
+ ```
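Step 2 of the flow above could be sketched as a small shell function. `classify_checkpoint` is a hypothetical name, and two choices here are assumptions rather than documented behavior: matching is case-insensitive, and risky patterns are checked first so that a `blocking` value matching both lists (for example "framework detection failed") is treated as RISKY.

```bash
# Hypothetical helper: classify a checkpoint from its `blocking` text.
classify_checkpoint() (
  # Lowercase the input so pattern matching ignores case (an assumption).
  blocking=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$blocking" in
    # Risky patterns are checked first (an assumption; see lead-in).
    *unresolved*|*failed*|*"application bug"*) echo "RISKY" ;;
    *"framework detection"*|*assumptions*|*audit*|*data-testid*) echo "SAFE" ;;
    # No pattern matched: stay conservative.
    *) echo "RISKY" ;;
  esac
)

# Example usage:
#   if [ "$IS_AUTO" = "true" ] && [ "$(classify_checkpoint "$blocking")" = "SAFE" ]; then
#     echo "Auto-approved: ${checkpoint_description}"
#   fi
```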
1195
+
1196
+ ### Resume After Checkpoint
1197
+
1198
+ When resuming after a checkpoint, spawn a FRESH agent rather than restoring serialized state. This follows the GSD pattern: a fresh agent given explicit state is more reliable than a serialized continuation.
1199
+
1200
+ ```
1201
+ Task(
1202
+ prompt="
1203
+ <objective>Continue QA pipeline from {stage} stage</objective>
1204
+ <execution_context>@agents/qa-pipeline-orchestrator.md</execution_context>
1205
+ <resume_context>
1206
+ Pipeline state:
1207
+ - Completed stages: {list of completed stages with their results}
1208
+ - Current stage: {stage that triggered checkpoint}
1209
+ - Checkpoint response: {user's response or decision}
1210
+ - Artifacts produced so far: {list of files with paths}
1211
+
1212
+ Resume from: {exact step in pipeline to resume from}
1213
+ User decision: {what user chose at checkpoint}
1214
+ </resume_context>
1215
+ "
1216
+ )
1217
+ ```
1218
+
1219
+ The continuation agent reads this resume_context, verifies the completed stages by checking artifact existence on disk, and continues from the specified point. It does NOT re-execute completed stages.
1220
+
1221
+ ### Checkpoint Return Structure
1222
+
1223
+ Agents return checkpoints in this structure:
1224
+ ```
1225
+ CHECKPOINT_RETURN:
1226
+ completed: "What has been done so far"
1227
+ blocking: "What is blocking progress"
1228
+ details: "Detailed context about the blocking issue"
1229
+ awaiting: "What the user needs to do or provide"
1230
+ ```
1231
+
1232
+ The orchestrator parses the `blocking` field to classify the checkpoint.
1233
+ </checkpoint_system>
1234
+
1235
+ <error_handling>
1236
+ ## Error Handling
1237
+
1238
+ ### Stage Failure Protocol
1239
+
1240
+ When any agent returns a failure or error:
1241
+
1242
+ 1. **Set stage status to `failed`:**
1243
+ ```bash
1244
+ node bin/qaa-tools.cjs state patch --"{Stage} Status" failed --"Status" "Pipeline stopped: {Stage} failed - {reason}"
1245
+ ```
1246
+
1247
+ 2. **Print failure banner:**
1248
+ ```
1249
+ !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
1250
+ ! PIPELINE STOPPED !
1251
+ ! Stage: {stage_name} !
1252
+ ! Reason: {failure_reason} !
1253
+ ! !
1254
+ ! Completed: {completed_stages} !
1255
+ ! Artifacts: {artifacts_so_far} !
1256
+ ! !
1257
+ ! Action required: Review and re-run !
1258
+ !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
1259
+ ```
1260
+
1261
+ 3. **DO NOT continue to next stage.** The pipeline stops entirely at the failed stage.
1262
+
1263
+ 4. **DO NOT create a partial PR.** No branch, no commit, no PR with incomplete results.
1264
+
1265
+ 5. **Preserve all artifacts produced so far.** They may be useful for debugging the failure. Artifacts from completed stages remain on disk in `{output_dir}/`.
1266
+
1267
+ ### Agent Return Validation
1268
+
1269
+ After EVERY agent spawn, before advancing to next stage:
1270
+
1271
+ 1. **Check the return for error/stop conditions:**
1272
+ - Scanner: Check `decision` field -- if `STOP`, pipeline stops
1273
+ - Validator: Check `overall_status` -- if `FAIL` with unresolved issues, checkpoint triggers
1274
+ - Bug Detective: Check `classification_breakdown.app_bug` -- if > 0, checkpoint triggers
1275
+ - Any agent: Check for error messages, empty returns, or missing expected fields
1276
+
1277
+ 2. **Verify expected output artifacts exist on disk:**
1278
+ ```bash
1279
+ [ -f "{expected_artifact_path}" ] && echo "OK" || echo "MISSING"
1280
+ ```
1281
+ - Scanner: `{output_dir}/SCAN_MANIFEST.md` must exist
1282
+ - Analyzer: `{output_dir}/QA_ANALYSIS.md` (Option 1) or `{output_dir}/GAP_ANALYSIS.md` (Options 2/3) must exist
1283
+ - Planner: `{output_dir}/GENERATION_PLAN.md` must exist
1284
+ - Executor: All planned test files must exist
1285
+ - Validator: `{output_dir}/VALIDATION_REPORT.md` must exist
1286
+
1287
+ 3. **If artifacts missing:** Treat as stage failure. Set status to failed and stop pipeline.
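The artifact check in items 2 and 3 could be generalized to several paths at once. This is a minimal sketch; `verify_artifacts` is a hypothetical helper, and the caller is assumed to pass the expected paths for the stage that just finished.

```bash
# Hypothetical helper: report each expected artifact and return non-zero
# if any of them is missing, so the caller can treat it as a stage failure.
verify_artifacts() (
  missing=0
  for artifact in "$@"; do
    if [ -f "$artifact" ]; then
      echo "OK: $artifact"
    else
      echo "MISSING: $artifact"
      missing=1
    fi
  done
  exit "$missing"
)

# Example usage:
#   verify_artifacts "${output_dir}/SCAN_MANIFEST.md" || echo "Scanner stage failed"
```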
1288
+
1289
+ ### Retry Policy
1290
+
1291
+ The orchestrator does NOT retry failed agents automatically. If a stage fails:
1292
+
1293
+ - **In auto mode:** Stop pipeline entirely and report the failure. Print which stage failed, what error occurred, and what artifacts were produced before failure.
1294
+ - **In manual mode:** Stop and present the failure to user. User can choose to:
1295
+ - Retry the failed stage (orchestrator spawns the same agent again)
1296
+ - Abort the pipeline
1297
+ - Provide guidance and retry with modifications
1298
+ </error_handling>
1299
+
1300
+ <pipeline_summary>
1301
+ ## Pipeline Summary
1302
+
1303
+ After all stages complete (or on pipeline stop), print a summary banner:
1304
+
1305
+ ```
1306
+ ======================================================
1307
+ QA PIPELINE COMPLETE
1308
+ ======================================================
1309
+
1310
+ Option: {option} ({option_description})
1311
+ Repository: {dev_repo_path}
1312
+ QA Repo: {qa_repo_path or 'N/A'}
1313
+ Maturity Score: {maturity_score or 'N/A'}
1314
+
1315
+ Stages Completed:
1316
+ [{check}] Scan -- {scan_duration} {scan_extra}
1317
+ [{check}] Analyze -- {analyze_duration} ({test_count} test cases)
1318
+ [{check}] TestID Inject-- {inject_duration or 'skipped'}
1319
+ [{check}] Plan -- {plan_duration} ({file_count} files planned)
1320
+ [{check}] Generate -- {generate_duration} ({files_created} files created)
1321
+ [{check}] Validate -- {validate_duration} ({confidence} confidence)
1322
+ [{check}] Bug Detective-- {detective_duration or 'skipped'}
1323
+ [{check}] Deliver -- {deliver_duration}
1324
+
1325
+ PR: {pr_url or 'not created (local-only)'}
1326
+
1327
+ Artifacts:
1328
+ {list all produced .md files in output_dir}
1329
+
1330
+ Total Time: {total_duration}
1331
+ ======================================================
1332
+ ```
1333
+
1334
+ Where:
1335
+ - `[x]` = stage completed successfully
1336
+ - `[ ]` = stage skipped (testid-inject when no frontend, bug-detective when no failures)
1337
+ - `[!]` = stage failed
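The marker legend above could be produced by a small mapping function. `stage_marker` is a hypothetical helper; the status names `complete` and `failed` follow the state patches used elsewhere in this document, while `skipped` and the `[?]` fallback are assumptions.

```bash
# Hypothetical helper: map a stage status to its summary-banner marker.
stage_marker() {
  case "$1" in
    complete) echo "[x]" ;;
    skipped)  echo "[ ]" ;;  # "skipped" as a status name is an assumption
    failed)   echo "[!]" ;;
    *)        echo "[?]" ;;  # unknown status; fallback marker is an assumption
  esac
}

# Example usage:
#   printf '%s Scan -- %s\n' "$(stage_marker "$scan_status")" "$scan_duration"
```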
1338
+
1339
+ **On pipeline failure:** The summary still prints, but shows which stages completed and which failed, along with the failure reason.
1340
+
1341
+ **Artifact list includes:**
1342
+ - SCAN_MANIFEST.md (always)
1343
+ - QA_ANALYSIS.md (Option 1) or GAP_ANALYSIS.md (Options 2/3)
1344
+ - TEST_INVENTORY.md (Option 1)
1345
+ - QA_REPO_BLUEPRINT.md (Option 1, if produced)
1346
+ - TESTID_AUDIT_REPORT.md (if frontend detected)
1347
+ - GENERATION_PLAN.md (if plan stage completed)
1348
+ - Generated test files (if generate stage completed)
1349
+ - VALIDATION_REPORT.md (if validate stage completed)
1350
+ - FAILURE_CLASSIFICATION_REPORT.md (if bug detective ran)
1351
+ </pipeline_summary>
1352
+
1353
+ <quality_gate>
1354
+ ## Quality Gate
1355
+
1356
+ Before this orchestrator is considered complete, verify:
1357
+
1358
+ - [ ] All 3 workflow options route to correct stage sequences:
1359
+ - Option 1: scan(dev) -> analyze(full) -> [testid-inject] -> plan -> generate -> validate -> [bug-detective] -> deliver
1360
+ - Option 2: scan(both) -> analyze(gap) -> [testid-inject] -> plan(gap) -> generate(gap) -> validate -> [bug-detective] -> deliver
1361
+ - Option 3: scan(both) -> analyze(gap) -> [testid-inject] -> plan(gap) -> generate(skip-existing) -> validate -> [bug-detective] -> deliver
1362
+ - [ ] Every agent spawn is bracketed by state updates (running before, complete/failed after)
1363
+ - [ ] Auto-advance correctly classifies safe vs risky checkpoints
1364
+ - [ ] Pipeline stops entirely on any stage failure (no partial PR)
1365
+ - [ ] Progress banners print for every stage even in auto mode
1366
+ - [ ] Deliver stage creates branch, commits per-stage, pushes, and creates draft PR via gh CLI
1367
+ - [ ] Resume spawns fresh agent with explicit state (no serialization)
1368
+ </quality_gate>
1369
+
1370
+ <success_criteria>
1371
+ ## Success Criteria
1372
+
1373
+ 1. A QA engineer can invoke the orchestrator, and the pipeline runs through all stages for their repo type
1374
+ 2. Option detection is automatic based on repo count and maturity scoring
1375
+ 3. Pipeline state in STATE.md accurately reflects progress at every point
1376
+ 4. Checkpoints pause when appropriate and auto-approve when safe
1377
+ 5. Failure in any stage stops the pipeline cleanly with an actionable error message
1378
+ </success_criteria>