qaa-agent 1.0.0

Files changed (56)
  1. package/.claude/commands/create-test.md +40 -0
  2. package/.claude/commands/qa-analyze.md +60 -0
  3. package/.claude/commands/qa-audit.md +37 -0
  4. package/.claude/commands/qa-blueprint.md +54 -0
  5. package/.claude/commands/qa-fix.md +36 -0
  6. package/.claude/commands/qa-from-ticket.md +88 -0
  7. package/.claude/commands/qa-gap.md +54 -0
  8. package/.claude/commands/qa-pom.md +36 -0
  9. package/.claude/commands/qa-pyramid.md +37 -0
  10. package/.claude/commands/qa-report.md +38 -0
  11. package/.claude/commands/qa-start.md +33 -0
  12. package/.claude/commands/qa-testid.md +54 -0
  13. package/.claude/commands/qa-validate.md +54 -0
  14. package/.claude/commands/update-test.md +58 -0
  15. package/.claude/settings.json +19 -0
  16. package/.claude/skills/qa-bug-detective/SKILL.md +122 -0
  17. package/.claude/skills/qa-repo-analyzer/SKILL.md +88 -0
  18. package/.claude/skills/qa-self-validator/SKILL.md +109 -0
  19. package/.claude/skills/qa-template-engine/SKILL.md +113 -0
  20. package/.claude/skills/qa-testid-injector/SKILL.md +93 -0
  21. package/.claude/skills/qa-workflow-documenter/SKILL.md +87 -0
  22. package/CLAUDE.md +543 -0
  23. package/README.md +418 -0
  24. package/agents/qa-pipeline-orchestrator.md +1217 -0
  25. package/agents/qaa-analyzer.md +508 -0
  26. package/agents/qaa-bug-detective.md +444 -0
  27. package/agents/qaa-executor.md +618 -0
  28. package/agents/qaa-planner.md +374 -0
  29. package/agents/qaa-scanner.md +422 -0
  30. package/agents/qaa-testid-injector.md +583 -0
  31. package/agents/qaa-validator.md +450 -0
  32. package/bin/install.cjs +176 -0
  33. package/bin/lib/commands.cjs +709 -0
  34. package/bin/lib/config.cjs +307 -0
  35. package/bin/lib/core.cjs +497 -0
  36. package/bin/lib/frontmatter.cjs +299 -0
  37. package/bin/lib/init.cjs +989 -0
  38. package/bin/lib/milestone.cjs +241 -0
  39. package/bin/lib/model-profiles.cjs +60 -0
  40. package/bin/lib/phase.cjs +911 -0
  41. package/bin/lib/roadmap.cjs +306 -0
  42. package/bin/lib/state.cjs +748 -0
  43. package/bin/lib/template.cjs +222 -0
  44. package/bin/lib/verify.cjs +842 -0
  45. package/bin/qaa-tools.cjs +607 -0
  46. package/package.json +34 -0
  47. package/templates/failure-classification.md +391 -0
  48. package/templates/gap-analysis.md +409 -0
  49. package/templates/pr-template.md +48 -0
  50. package/templates/qa-analysis.md +381 -0
  51. package/templates/qa-audit-report.md +465 -0
  52. package/templates/qa-repo-blueprint.md +636 -0
  53. package/templates/scan-manifest.md +312 -0
  54. package/templates/test-inventory.md +582 -0
  55. package/templates/testid-audit-report.md +354 -0
  56. package/templates/validation-report.md +243 -0
@@ -0,0 +1,1217 @@
1
+ <purpose>
2
+ Single orchestrator for the QA automation pipeline. Coordinates all 7 agent types (scanner, analyzer, planner, executor, validator, bug-detective, testid-injector) across 3 workflow options. Owns all pipeline state transitions -- agents never update state directly. The orchestrator sets stage status to 'running' before spawning an agent and 'complete' or 'failed' after the agent returns.
3
+
4
+ Invoked by the `/qa-start` slash command (Phase 6) or directly via Task() with this file as execution_context. Accepts 0-2 repo paths: with 0 paths, cwd is used as the dev repo (Option 1); 1 path is treated as dev-only (Option 1); 2 paths trigger maturity scoring, which selects Option 2 or 3.
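The path-count routing described above can be sketched as a small shell function. This is a minimal sketch, not the logic inside `qaa-tools.cjs`: the function name `select_option` and the `MATURITY_THRESHOLD` cutoff of 70 are assumptions for illustration, because the source does not state which maturity score separates Option 2 from Option 3.

```bash
# Hypothetical cutoff: the source does not say where "immature" ends.
MATURITY_THRESHOLD="${MATURITY_THRESHOLD:-70}"

# select_option <path_count> [maturity_score]
# Prints the workflow option (1, 2, or 3) for the given number of repo paths.
select_option() {
  local path_count="$1" score="${2:-0}"
  case "$path_count" in
    0|1)
      # No QA repo supplied: dev-only, full pipeline.
      echo 1
      ;;
    2)
      # Dev + QA repo: the maturity score picks gap-fill (2) or surgical (3).
      if [ "$score" -ge "$MATURITY_THRESHOLD" ]; then
        echo 3
      else
        echo 2
      fi
      ;;
    *)
      echo "error: expected 0-2 repo paths, got $path_count" >&2
      return 1
      ;;
  esac
}
```

For example, `select_option 2 85` prints `3` under the assumed threshold, while `select_option 1` prints `1` regardless of score.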
5
+
6
+ **Pipeline stages in order:**
7
+ ```
8
+ scan -> analyze -> [testid-inject if frontend] -> plan -> generate -> validate -> [bug-detective if failures] -> deliver
9
+ ```
10
+
11
+ **Workflow options:**
12
+ - Option 1: Dev-only repo -- full pipeline from scratch
13
+ - Option 2: Dev + immature QA repo -- gap-fill and standardize
14
+ - Option 3: Dev + mature QA repo -- surgical additions only
15
+ </purpose>
16
+
17
+ <required_reading>
18
+ Read these files BEFORE executing any pipeline stage. Do NOT skip.
19
+
20
+ - **CLAUDE.md** -- Agent pipeline stages, module boundaries, quality gates, stage transitions, auto-advance rules, agent coordination, data-testid convention. Read the full file.
21
+ - **.planning/STATE.md** -- Current pipeline state. Check scan_status, analyze_status, generate_status, validate_status, deliver_status fields.
22
+ - **.planning/config.json** -- Workflow configuration: auto_advance flag, parallelization flag, mode, commit_docs.
23
+ </required_reading>
24
+
25
+ <process>
26
+
27
+ <step name="initialize">
28
+ ## Step 1: Initialize Pipeline
29
+
30
+ Call `qaa-tools.cjs init qa-start` to bootstrap the full workflow context.
31
+
32
+ ```bash
33
+ INIT_JSON=$(node bin/qaa-tools.cjs init qa-start)
34
+ ```
35
+
36
+ Parse the JSON to extract all required fields:
37
+
38
+ ```
39
+ option -- 1, 2, or 3 (workflow routing)
40
+ dev_repo_path -- path to the developer repository
41
+ qa_repo_path -- path to existing QA repository (null for Option 1)
42
+ maturity_score -- 0-100 QA repo quality score (null for Option 1)
43
+ maturity_note -- descriptive note about maturity assessment (null for Option 1)
44
+ output_dir -- ".qa-output" (where agents write artifacts)
45
+ date -- "YYYY-MM-DD" for branch naming and timestamps
46
+
47
+ scanner_model -- model for scanner agent
48
+ analyzer_model -- model for analyzer agent
49
+ planner_model -- model for planner agent
50
+ executor_model -- model for executor agent
51
+ validator_model -- model for validator agent
52
+ detective_model -- model for bug-detective agent
53
+ injector_model -- model for testid-injector agent
54
+
55
+ auto_advance -- persistent config flag (boolean)
56
+ auto_chain_active -- ephemeral chain flag (boolean)
57
+ parallelization -- parallelization config value
58
+ commit_docs -- whether to commit documentation artifacts
59
+ ```
60
+
61
+ **Determine auto-advance mode:**
62
+
63
+ ```bash
64
+ IS_AUTO=false
65
+
66
+ # Check persistent config flag
67
+ if auto_advance is true OR auto_chain_active is true; then
68
+ IS_AUTO=true
69
+ node bin/qaa-tools.cjs config-set workflow._auto_chain_active true
70
+ fi
71
+
72
+ # Check if --auto was passed as argument to orchestrator invocation
73
+ if --auto flag was passed; then
74
+ IS_AUTO=true
75
+ node bin/qaa-tools.cjs config-set workflow._auto_chain_active true
76
+ fi
77
+ ```
78
+
79
+ **Safety: Clear stale auto-chain flag** -- if NOT in auto mode, clear the ephemeral flag to prevent a previous interrupted `--auto` run from causing unexpected auto-advance:
80
+
81
+ ```bash
82
+ if IS_AUTO is false:
83
+ node bin/qaa-tools.cjs config-set workflow._auto_chain_active false
84
+ ```
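The two pseudocode blocks above can be read as one decision. The following is a hedged sketch of that decision in runnable shell; the function name `resolve_is_auto` and the idea of passing the three inputs as positional arguments are assumptions, since the source does not show how the parsed init JSON reaches the shell.

```bash
# resolve_is_auto <auto_advance> <auto_chain_active> [cli_arg]
# Prints "true" when any of the three auto-advance signals is set, else "false".
resolve_is_auto() {
  local auto_advance="$1" auto_chain_active="$2" cli_arg="${3:-}"
  if [ "$auto_advance" = "true" ] ||
     [ "$auto_chain_active" = "true" ] ||
     [ "$cli_arg" = "--auto" ]; then
    echo true
  else
    echo false
  fi
}

# Usage (requires the package's CLI; shown as a comment so the sketch runs standalone):
#   IS_AUTO=$(resolve_is_auto "$auto_advance" "$auto_chain_active" "$1")
#   node bin/qaa-tools.cjs config-set workflow._auto_chain_active "$IS_AUTO"
```

Writing `$IS_AUTO` back to the ephemeral flag covers both cases at once: it sets the flag when auto mode is active and clears a stale flag when it is not.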
85
+
86
+ **Print initialization banner:**
87
+
88
+ ```
89
+ === QA Pipeline Orchestrator ===
90
+ Option: {option} ({description})
91
+ Dev Repo: {dev_repo_path}
92
+ QA Repo: {qa_repo_path or 'N/A'}
93
+ Maturity Score: {maturity_score or 'N/A'}
94
+ Auto-Advance: {IS_AUTO}
95
+ Date: {date}
96
+ ================================
97
+ ```
98
+
99
+ Where `{description}` is:
100
+ - Option 1: "Dev-Only -- Full Pipeline"
101
+ - Option 2: "Dev + Immature QA -- Gap-Fill"
102
+ - Option 3: "Dev + Mature QA -- Surgical"
103
+ </step>
104
+
105
+ <step name="route_by_option">
106
+ ## Step 2: Route by Option
107
+
108
+ Based on `option` value from init, select the stage sequence. Each option shares the same core pipeline but differs in how agents are parameterized and what artifacts they produce.
109
+
110
+ **Option 1 stages:**
111
+ ```
112
+ scan(dev) -> analyze(full) -> [testid-inject if frontend] -> plan -> generate -> validate -> [bug-detective if failures] -> deliver
113
+ ```
114
+ - Scanner: scan DEV repo only
115
+ - Analyzer: mode='full' (produces QA_ANALYSIS.md + TEST_INVENTORY.md + QA_REPO_BLUEPRINT.md)
116
+ - Planner: reads TEST_INVENTORY.md + QA_ANALYSIS.md
117
+ - Executor: generates all planned test files
118
+ - All stages run against DEV repo artifacts
119
+
120
+ **Option 2 stages:**
121
+ ```
122
+ scan(both) -> analyze(gap) -> [testid-inject if frontend] -> plan(gap) -> generate(gap) -> validate -> [bug-detective if failures] -> deliver
123
+ ```
124
+ - Scanner: scan BOTH dev_repo_path and qa_repo_path
125
+ - Analyzer: mode='gap' (produces GAP_ANALYSIS.md)
126
+ - Planner: reads GAP_ANALYSIS.md (fix broken first, then add missing, then standardize)
127
+ - Executor: generates fixed test files + new test files + standardized files
128
+ - All stages aware of existing QA repo structure
129
+
130
+ **Option 3 stages:**
131
+ ```
132
+ scan(both) -> analyze(gap) -> [testid-inject if frontend] -> plan(gap) -> generate(skip-existing) -> validate -> [bug-detective if failures] -> deliver
133
+ ```
134
+ - Scanner: scan BOTH dev_repo_path and qa_repo_path
135
+ - Analyzer: mode='gap' (produces GAP_ANALYSIS.md with thin areas only)
136
+ - Planner: reads GAP_ANALYSIS.md (missing tests only)
137
+ - Executor: passes `skip_existing_test_ids: true` so it checks existing test files by test ID before generating -- skips tests that already exist
138
+ - Only new test files are generated; existing working tests are left untouched
139
+
140
+ **Shared stages across all options:**
141
+ - TestID injection: conditional on `has_frontend` from scanner return
142
+ - Validation: always runs on generated files
143
+ - Bug detective: conditional on test failures
144
+ - Deliver: always runs (branch creation, per-stage commits, push, draft PR via gh CLI)
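The three stage sequences and the two conditional stages can be assembled mechanically. Below is a minimal sketch under the assumption that `has_frontend` and a failure indicator are available as the strings `true` or `false`; the function name `stage_sequence` is hypothetical and does not appear in the package.

```bash
# stage_sequence <option> <has_frontend> <has_failures>
# Prints the ordered stage list for a workflow option, including
# the conditional testid-inject and bug-detective stages when they apply.
stage_sequence() {
  local option="$1" has_frontend="$2" has_failures="$3"
  local scan analyze plan generate stages
  case "$option" in
    1) scan="scan(dev)";  analyze="analyze(full)"; plan="plan";      generate="generate" ;;
    2) scan="scan(both)"; analyze="analyze(gap)";  plan="plan(gap)"; generate="generate(gap)" ;;
    3) scan="scan(both)"; analyze="analyze(gap)";  plan="plan(gap)"; generate="generate(skip-existing)" ;;
    *) echo "error: unknown option '$option'" >&2; return 1 ;;
  esac
  stages="$scan -> $analyze"
  if [ "$has_frontend" = "true" ]; then
    stages="$stages -> testid-inject"
  fi
  stages="$stages -> $plan -> $generate -> validate"
  if [ "$has_failures" = "true" ]; then
    stages="$stages -> bug-detective"
  fi
  echo "$stages -> deliver"
}
```

In a real run the failure indicator is only known after validation, so the full sequence cannot be printed up front; the sketch is meant for summaries and dry-run output.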
145
+ </step>
146
+
147
+ <step name="execute_scan">
148
+ ## Step 3: Execute Scan Stage
149
+
150
+ **State update -- mark scan as running:**
151
+ ```bash
152
+ node bin/qaa-tools.cjs state patch --"Scan Status" running --"Status" "Scanning repository"
153
+ ```
154
+
155
+ **Print stage banner:**
156
+ ```
157
+ +------------------------------------------+
158
+ | STAGE 1: Scanner |
159
+ | Status: Running... |
160
+ +------------------------------------------+
161
+ ```
162
+
163
+ **Spawn scanner agent via Task():**
164
+
165
+ For **Option 1** -- scan DEV repo only:
166
+ ```
167
+ Task(
168
+ prompt="
169
+ <objective>Scan repository and produce SCAN_MANIFEST.md</objective>
170
+ <execution_context>@agents/qaa-scanner.md</execution_context>
171
+ <files_to_read>
172
+ - CLAUDE.md
173
+ </files_to_read>
174
+ <parameters>
175
+ dev_repo_path: {dev_repo_path}
176
+ qa_repo_path: null
177
+ output_path: {output_dir}/SCAN_MANIFEST.md
178
+ </parameters>
179
+ "
180
+ )
181
+ ```
182
+
183
+ For **Options 2 and 3** -- scan BOTH repos:
184
+ ```
185
+ Task(
186
+ prompt="
187
+ <objective>Scan both developer and QA repositories and produce SCAN_MANIFEST.md</objective>
188
+ <execution_context>@agents/qaa-scanner.md</execution_context>
189
+ <files_to_read>
190
+ - CLAUDE.md
191
+ </files_to_read>
192
+ <parameters>
193
+ dev_repo_path: {dev_repo_path}
194
+ qa_repo_path: {qa_repo_path}
195
+ output_path: {output_dir}/SCAN_MANIFEST.md
196
+ </parameters>
197
+ "
198
+ )
199
+ ```
200
+
201
+ **Parse scanner return:**
202
+
203
+ Expected return structure:
204
+ ```
205
+ SCANNER_COMPLETE:
206
+ file_path: ".qa-output/SCAN_MANIFEST.md"
207
+ decision: PROCEED | STOP
208
+ has_frontend: true | false
209
+ detection_confidence: HIGH | MEDIUM | LOW
210
+ ```
211
+
212
+ **Handle decision field:**
213
+
214
+ - If `decision` is `STOP`:
215
+ ```bash
216
+ node bin/qaa-tools.cjs state patch --"Scan Status" failed --"Status" "Pipeline stopped: Scanner returned STOP"
217
+ ```
218
+ Print failure banner and STOP PIPELINE ENTIRELY. Do NOT proceed to any further stage.
219
+
220
+ - If `decision` is `PROCEED`:
221
+ ```bash
222
+ node bin/qaa-tools.cjs state patch --"Scan Status" complete
223
+ ```
224
+ Capture `has_frontend` for testid-injector conditional.
225
+ Capture `detection_confidence` for checkpoint handling.
226
+
227
+ **Handle scanner checkpoint -- framework detection uncertain:**
228
+
229
+ If `detection_confidence` is `LOW`:
230
+ - If `IS_AUTO` is true: Auto-approve with most likely framework (SAFE checkpoint). Log: "Auto-approved: Scanner framework detection (LOW confidence, selected most likely framework)". Continue pipeline.
231
+ - If `IS_AUTO` is false: Present the detection details to user. Wait for confirmation. On user response, spawn fresh continuation agent with user's framework choice.
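The scanner handling above amounts to parsing a few fields and choosing one of four actions. This is a rough sketch, assuming the agent return arrives as plain text with one `name: value` pair per line, as in the `SCANNER_COMPLETE` example; `get_field` and `scan_next_action` are hypothetical helper names.

```bash
# get_field <field_name>
# Reads a structured agent return on stdin and prints the first matching value,
# with surrounding double quotes removed.
get_field() {
  sed -n "s/^[[:space:]]*$1:[[:space:]]*//p" | head -n 1 | tr -d '"'
}

# scan_next_action <decision> <detection_confidence> <is_auto>
# Prints: stop | continue | auto-approve | ask-user
scan_next_action() {
  local decision="$1" confidence="$2" is_auto="$3"
  case "$decision" in
    STOP)
      echo stop
      ;;
    PROCEED)
      if [ "$confidence" != "LOW" ]; then
        echo continue
      elif [ "$is_auto" = "true" ]; then
        echo auto-approve   # SAFE checkpoint: most likely framework is selected
      else
        echo ask-user       # wait for the user's framework choice
      fi
      ;;
    *)
      echo "error: unexpected decision '$decision'" >&2
      return 1
      ;;
  esac
}
```

A caller could combine the two, for example `scan_next_action "$(printf '%s\n' "$RETURN" | get_field decision)" "$(printf '%s\n' "$RETURN" | get_field detection_confidence)" "$IS_AUTO"`, where `RETURN` holds the scanner's text.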
232
+ </step>
233
+
234
+ <step name="execute_analyze">
235
+ ## Step 4: Execute Analyze Stage
236
+
237
+ **State update -- mark analyze as running:**
238
+ ```bash
239
+ node bin/qaa-tools.cjs state patch --"Analyze Status" running --"Status" "Analyzing repository"
240
+ ```
241
+
242
+ **Print stage banner:**
243
+ ```
244
+ +------------------------------------------+
245
+ | STAGE 2: Analyzer |
246
+ | Status: Running... |
247
+ +------------------------------------------+
248
+ ```
249
+
250
+ **Determine analyzer mode based on option:**
251
+ - Option 1: `mode = 'full'` (produces QA_ANALYSIS.md + TEST_INVENTORY.md + QA_REPO_BLUEPRINT.md)
252
+ - Options 2 and 3: `mode = 'gap'` (produces GAP_ANALYSIS.md)
253
+
254
+ **Spawn analyzer agent via Task():**
255
+ ```
256
+ Task(
257
+ prompt="
258
+ <objective>Analyze scanned repository and produce analysis artifacts</objective>
259
+ <execution_context>@agents/qaa-analyzer.md</execution_context>
260
+ <files_to_read>
261
+ - {output_dir}/SCAN_MANIFEST.md
262
+ - CLAUDE.md
263
+ </files_to_read>
264
+ <parameters>
265
+ mode: {mode}
266
+ workflow_option: {option}
267
+ dev_repo_path: {dev_repo_path}
268
+ qa_repo_path: {qa_repo_path or null}
269
+ output_path: {output_dir}/
270
+ </parameters>
271
+ "
272
+ )
273
+ ```
274
+
275
+ **Parse analyzer return:**
276
+
277
+ Expected return structure:
278
+ ```
279
+ ANALYZER_COMPLETE:
280
+ files_produced: [...]
281
+ total_test_count: N
282
+ pyramid_breakdown: {unit: N, integration: N, api: N, e2e: N}
283
+ risk_count: {high: N, medium: N, low: N}
284
+ commit_hash: "..."
285
+ ```
286
+
287
+ Capture `files_produced`, `total_test_count`, `pyramid_breakdown` for downstream stages.
288
+
289
+ **Handle analyzer checkpoint -- assumptions review:**
290
+
291
+ If the analyzer returns a checkpoint with assumptions:
292
+ - If `IS_AUTO` is true: Auto-approve all assumptions (SAFE checkpoint). Log: "Auto-approved: Analyzer assumptions". Continue pipeline.
293
+ - If `IS_AUTO` is false: Present assumptions to user for review. Wait for confirmation or corrections. On user response, spawn fresh continuation agent incorporating any corrections.
294
+
295
+ **State update -- mark analyze as complete:**
296
+ ```bash
297
+ node bin/qaa-tools.cjs state patch --"Analyze Status" complete
298
+ ```
299
+
300
+ Print completion message: "Analysis complete. {total_test_count} test cases identified. Pyramid: {pyramid_breakdown}."
301
+ </step>
302
+
303
+ <step name="execute_testid_inject">
304
+ ## Step 5: Execute TestID Injection Stage (Conditional)
305
+
306
+ **Condition:** Only execute if `has_frontend` is `true` from scanner return (Step 3).
307
+
308
+ **If `has_frontend` is false:**
309
+ Print: "Skipping TestID injection (no frontend detected)." and proceed directly to Step 6 (Plan).
310
+
311
+ **If `has_frontend` is true:**
312
+
313
+ **State update:**
314
+ ```bash
315
+ node bin/qaa-tools.cjs state patch --"Status" "Injecting test IDs into frontend components"
316
+ ```
317
+
318
+ **Print stage banner:**
319
+ ```
320
+ +------------------------------------------+
321
+ | STAGE 3: TestID Injector |
322
+ | Status: Running... |
323
+ +------------------------------------------+
324
+ ```
325
+
326
+ **Spawn testid-injector agent via Task():**
327
+ ```
328
+ Task(
329
+ prompt="
330
+ <objective>Audit and inject data-testid attributes into frontend components</objective>
331
+ <execution_context>@agents/qaa-testid-injector.md</execution_context>
332
+ <files_to_read>
333
+ - {output_dir}/SCAN_MANIFEST.md
334
+ - CLAUDE.md
335
+ </files_to_read>
336
+ <parameters>
337
+ dev_repo_path: {dev_repo_path}
338
+ output_path: {output_dir}/TESTID_AUDIT_REPORT.md
339
+ </parameters>
340
+ "
341
+ )
342
+ ```
343
+
344
+ **Parse return:**
345
+
346
+ Check for `INJECTOR_COMPLETE` vs `INJECTOR_SKIPPED`:
347
+
348
+ If `INJECTOR_COMPLETE`:
349
+ ```
350
+ INJECTOR_COMPLETE:
351
+ report_path: "..."
352
+ coverage_before: N%
353
+ coverage_after: N%
354
+ elements_injected: N
355
+ ...
356
+ ```
357
+ Log: "TestID injection complete. Coverage: {coverage_before}% -> {coverage_after}%. {elements_injected} elements injected."
358
+
359
+ If `INJECTOR_SKIPPED`:
360
+ ```
361
+ INJECTOR_SKIPPED:
362
+ reason: "..."
363
+ action: "..."
364
+ ```
365
+ Log the reason and continue pipeline.
366
+
367
+ **Handle injector checkpoint -- audit review:**
368
+ - If `IS_AUTO` is true: Auto-approve P0-only injection (SAFE checkpoint). Log: "Auto-approved: TestID injection (P0 elements only)". Continue pipeline.
369
+ - If `IS_AUTO` is false: Present audit report to user. Wait for approval, element selection, or rejection. On user response, spawn fresh continuation agent with user's approved elements.
370
+ </step>
371
+
372
+ <step name="execute_plan">
373
+ ## Step 6: Execute Plan Stage
374
+
375
+ **State update -- mark generation as running (planning is part of generate):**
376
+ ```bash
377
+ node bin/qaa-tools.cjs state patch --"Generate Status" running --"Status" "Planning test generation"
378
+ ```
379
+
380
+ **Print stage banner:**
381
+ ```
382
+ +------------------------------------------+
383
+ | STAGE 4: Planner |
384
+ | Status: Running... |
385
+ +------------------------------------------+
386
+ ```
387
+
388
+ **Determine planner input based on option:**
389
+ - Option 1: Input from `{output_dir}/TEST_INVENTORY.md` + `{output_dir}/QA_ANALYSIS.md`
390
+ - Options 2 and 3: Input from `{output_dir}/GAP_ANALYSIS.md`
391
+
392
+ **Spawn planner agent via Task():**
393
+ ```
394
+ Task(
395
+ prompt="
396
+ <objective>Create test generation plan with task breakdown and dependencies</objective>
397
+ <execution_context>@agents/qaa-planner.md</execution_context>
398
+ <files_to_read>
399
+ - {input files based on option -- see above}
400
+ - CLAUDE.md
401
+ </files_to_read>
402
+ <parameters>
403
+ workflow_option: {option}
404
+ output_path: {output_dir}/GENERATION_PLAN.md
405
+ </parameters>
406
+ "
407
+ )
408
+ ```
409
+
410
+ **Parse planner return:**
411
+
412
+ Expected return structure:
413
+ ```
414
+ PLANNER_COMPLETE:
415
+ file_path: "..."
416
+ total_tasks: N
417
+ total_files: N
418
+ feature_count: N
419
+ dependency_depth: N
420
+ test_case_count: N
421
+ commit_hash: "..."
422
+ ```
423
+
424
+ Capture `total_tasks`, `total_files`, `feature_count` for executor stage and pipeline summary.
425
+
426
+ Print: "Plan complete. {total_tasks} tasks, {total_files} files planned across {feature_count} features."
427
+ </step>
428
+
429
+ <step name="execute_generate">
430
+ ## Step 7: Execute Generate Stage (FLOW-04 -- Wave-based Parallel Execution)
431
+
432
+ State update continues from planning (already set to `running` in Step 6).
433
+
434
+ **Print stage banner:**
435
+ ```
436
+ +------------------------------------------+
437
+ | STAGE 5: Executor |
438
+ | Generating {total_files} test files |
439
+ | Status: Running... |
440
+ +------------------------------------------+
441
+ ```
442
+
443
+ **FLOW-04: Wave-based parallel execution:**
444
+
445
+ Check if planner created multiple independent feature groups. If `feature_count > 1` AND `parallelization` config allows parallel execution:
446
+
447
+ **Parallel execution (when feature_count > 1 and parallelization enabled):**
448
+
449
+ For each independent feature group from the generation plan, spawn a separate executor agent:
450
+ ```
451
+ Task(
452
+ prompt="
453
+ <objective>Generate test files for {feature} feature</objective>
454
+ <execution_context>@agents/qaa-executor.md</execution_context>
455
+ <files_to_read>
456
+ - {output_dir}/GENERATION_PLAN.md
457
+ - {output_dir}/TEST_INVENTORY.md (Option 1) or {output_dir}/GAP_ANALYSIS.md (Options 2/3)
458
+ - CLAUDE.md
459
+ </files_to_read>
460
+ <parameters>
461
+ workflow_option: {option}
462
+ feature_group: {feature}
463
+ dev_repo_path: {dev_repo_path}
464
+ qa_repo_path: {qa_repo_path or null}
465
+ output_path: {output_dir}/
466
+ </parameters>
467
+ "
468
+ )
469
+ ```
470
+
471
+ Multiple Task() calls can be issued simultaneously for independent feature groups. Each executor handles one feature group and commits its files independently.
472
+
473
+ **Sequential execution (when feature_count == 1 or parallelization disabled):**
474
+
475
+ Spawn a single executor agent covering all tasks:
476
+ ```
477
+ Task(
478
+ prompt="
479
+ <objective>Generate all test files from generation plan</objective>
480
+ <execution_context>@agents/qaa-executor.md</execution_context>
481
+ <files_to_read>
482
+ - {output_dir}/GENERATION_PLAN.md
483
+ - {output_dir}/TEST_INVENTORY.md (Option 1) or {output_dir}/GAP_ANALYSIS.md (Options 2/3)
484
+ - CLAUDE.md
485
+ </files_to_read>
486
+ <parameters>
487
+ workflow_option: {option}
488
+ dev_repo_path: {dev_repo_path}
489
+ qa_repo_path: {qa_repo_path or null}
490
+ output_path: {output_dir}/
491
+ </parameters>
492
+ "
493
+ )
494
+ ```
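The choice between the parallel and sequential branches above can be expressed as a single check. This sketch assumes the `parallelization` config value is the literal string `true` when parallel execution is allowed; the source only calls it a "parallelization config value", so the real format may differ. The name `execution_mode` is hypothetical.

```bash
# execution_mode <feature_count> <parallelization>
# Prints "parallel" when more than one feature group exists and
# parallelization is enabled, otherwise "sequential".
execution_mode() {
  local feature_count="$1" parallelization="$2"
  if [ "$feature_count" -gt 1 ] && [ "$parallelization" = "true" ]; then
    echo parallel
  else
    echo sequential
  fi
}
```

With `feature_count` taken from the planner return, `execution_mode 4 true` selects one executor per feature group, and `execution_mode 1 true` falls back to a single executor.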
495
+
496
+ **Option 3 specific -- skip existing tests:**
497
+
498
+ For Option 3, pass `skip_existing_test_ids: true` to the executor so it checks existing test files by test ID before generating. If a test ID already exists in the QA repo, skip generating that test case.
499
+
500
+ ```
501
+ <parameters>
502
+ workflow_option: 3
503
+ skip_existing_test_ids: true
504
+ dev_repo_path: {dev_repo_path}
505
+ qa_repo_path: {qa_repo_path}
506
+ output_path: {output_dir}/
507
+ </parameters>
508
+ ```
509
+
510
+ **Parse executor return:**
511
+
512
+ Expected return structure:
513
+ ```
514
+ EXECUTOR_COMPLETE:
515
+ files_created: [{path, type}, ...]
516
+ total_files: N
517
+ commit_count: N
518
+ features_covered: [...]
519
+ test_case_count: N
520
+ ```
521
+
522
+ Capture `files_created`, `total_files`, `commit_count` for validation stage and pipeline summary.
523
+
524
+ **State update -- mark generate as complete:**
525
+ ```bash
526
+ node bin/qaa-tools.cjs state patch --"Generate Status" complete --"Status" "Test generation complete"
527
+ ```
528
+
529
+ Print: "Generation complete. {total_files} files created across {features_covered.length} features. {commit_count} commits."
530
+ </step>
531
+
532
+ <step name="execute_validate">
533
+ ## Step 8: Execute Validate Stage
534
+
535
+ **State update -- mark validate as running:**
536
+ ```bash
537
+ node bin/qaa-tools.cjs state patch --"Validate Status" running --"Status" "Validating generated tests"
538
+ ```
539
+
540
+ **Print stage banner:**
541
+ ```
542
+ +------------------------------------------+
543
+ | STAGE 6: Validator |
544
+ | Validating {total_files} test files |
545
+ | Status: Running... |
546
+ +------------------------------------------+
547
+ ```
548
+
549
+ **Spawn validator agent via Task():**
550
+ ```
551
+ Task(
552
+ prompt="
553
+ <objective>Run 4-layer validation on all generated test files</objective>
554
+ <execution_context>@agents/qaa-validator.md</execution_context>
555
+ <files_to_read>
556
+ - {list all generated test files from executor return -- files_created paths}
557
+ - {output_dir}/GENERATION_PLAN.md
558
+ - CLAUDE.md
559
+ </files_to_read>
560
+ <parameters>
561
+ mode: validation
562
+ output_path: {output_dir}/VALIDATION_REPORT.md
563
+ </parameters>
564
+ "
565
+ )
566
+ ```
567
+
568
+ **Parse validator return:**
569
+
570
+ Expected return structure:
571
+ ```
572
+ VALIDATOR_COMPLETE:
573
+ report_path: "..."
574
+ overall_status: PASS | PASS_WITH_WARNINGS | FAIL
575
+ confidence: HIGH | MEDIUM | LOW
576
+ layers_summary: {syntax, structure, dependencies, logic}
577
+ fix_loops_used: N
578
+ issues_found: N
579
+ issues_fixed: N
580
+ unresolved_count: N
581
+ ```
582
+
583
+ **RISKY CHECKPOINT -- Validator escalation (FLOW-07):**
584
+
585
+ If `unresolved_count > 0` after max fix loops (3):
586
+ - **ALWAYS pause, even in auto mode** (this is a locked decision from CONTEXT.md)
587
+ - Present unresolved issues to user with full details from VALIDATION_REPORT.md
588
+ - Wait for user decision:
589
+ - `"approve-with-warnings"`: Accept the validation with warnings. Set Validate Status to complete. Continue to deliver.
590
+ - `"abort"`: Set Validate Status to failed. STOP PIPELINE ENTIRELY.
591
+ - Manual guidance: User provides specific fix instructions. Spawn fresh continuation agent to apply fixes and re-validate.
592
+
593
+ If `overall_status` is `PASS` or `PASS_WITH_WARNINGS` (and unresolved_count is 0):
594
+ ```bash
595
+ node bin/qaa-tools.cjs state patch --"Validate Status" complete --"Status" "Validation passed"
596
+ ```
597
+
598
+ Print: "Validation complete. Status: {overall_status}. Confidence: {confidence}. {issues_found} issues found, {issues_fixed} fixed, {unresolved_count} unresolved."
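The validator outcomes in this step can be summarized as a three-way decision. The sketch below is one reading of the rules, with a hypothetical function name `validate_next_action`; in particular, routing a `FAIL` with no unresolved issues to failure classification is an inference from the Step 9 condition rather than something this step states directly.

```bash
# validate_next_action <overall_status> <unresolved_count>
# Prints: pause | complete | classify-failures
validate_next_action() {
  local status="$1" unresolved="$2"
  if [ "$unresolved" -gt 0 ]; then
    # RISKY checkpoint: always pause, even when auto-advance is on.
    echo pause
  elif [ "$status" = "PASS" ] || [ "$status" = "PASS_WITH_WARNINGS" ]; then
    echo complete
  else
    # Assumption: a FAIL status is handed to the bug-detective stage.
    echo classify-failures
  fi
}
```

Note that the auto-advance flag is deliberately not an input: unresolved issues pause the pipeline regardless of `IS_AUTO`.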
599
+ </step>
600
+
601
+ <step name="execute_bug_detective">
602
+ ## Step 9: Execute Bug Detective Stage (Conditional)
603
+
604
+ **Condition:** Only execute if the validator reports test failures. Check:
605
+ - `overall_status === 'FAIL'` in validator return, OR
606
+ - Generated tests have runtime failures that need classification
607
+
608
+ If the validator reports `PASS` or `PASS_WITH_WARNINGS` and there are no test execution failures, skip this stage entirely.
609
+
610
+ **If no failures to classify:**
611
+ Print: "Skipping Bug Detective (no test failures detected)." and proceed directly to Step 10 (Deliver).
612
+
613
+ **If failures need classification:**
614
+
615
+ **State update:**
616
+ ```bash
617
+ node bin/qaa-tools.cjs state patch --"Status" "Classifying test failures"
618
+ ```
619
+
620
+ **Print stage banner:**
621
+ ```
622
+ +------------------------------------------+
623
+ | STAGE 7: Bug Detective |
624
+ | Status: Running... |
625
+ +------------------------------------------+
626
+ ```
627
+
628
+ **Spawn bug-detective agent via Task():**
629
+ ```
630
+ Task(
631
+ prompt="
632
+ <objective>Classify test failures and attempt auto-fixes for test errors</objective>
633
+ <execution_context>@agents/qaa-bug-detective.md</execution_context>
634
+ <files_to_read>
635
+ - {test execution results -- from validator or direct test run}
636
+ - {failing test source files -- paths from executor return}
637
+ - CLAUDE.md
638
+ </files_to_read>
639
+ <parameters>
640
+ output_path: {output_dir}/FAILURE_CLASSIFICATION_REPORT.md
641
+ </parameters>
642
+ "
643
+ )
644
+ ```
645
+
646
+ **Parse bug-detective return:**
647
+
648
+ Expected return structure:
649
+ ```
650
+ DETECTIVE_COMPLETE:
651
+ report_path: "..."
652
+ total_failures: N
653
+ classification_breakdown: {app_bug: N, test_error: N, env_issue: N, inconclusive: N}
654
+ auto_fixes_applied: N
655
+ auto_fixes_verified: N
656
+ commit_hash: "..."
657
+ ```
658
+
659
+ **RISKY CHECKPOINT -- Application bugs detected:**
660
+
661
+ If `classification_breakdown.app_bug > 0`:
662
+ - **ALWAYS pause, even in auto mode** (locked decision -- application bugs require developer action)
663
+ - Present APPLICATION BUG classifications to user with full evidence from FAILURE_CLASSIFICATION_REPORT.md
664
+ - These are genuine bugs in the application code discovered during test execution
665
+ - The bug detective never touches application code -- it only reports
666
+ - User must review and decide how to proceed:
667
+ - Acknowledge bugs and continue pipeline (bugs will be in the PR description for developer attention)
668
+ - Abort pipeline to fix bugs first
669
+
670
+ Print: "Bug Detective complete. {total_failures} failures classified: {app_bug} APP BUG, {test_error} TEST ERROR, {env_issue} ENV ISSUE, {inconclusive} INCONCLUSIVE. {auto_fixes_applied} auto-fixes applied."
671
+ </step>
672
+
673
+ <step name="execute_deliver">
674
+ ## Step 10: Execute Deliver Stage
675
+
676
+ **State update -- mark deliver as running:**
677
+ ```bash
678
+ node bin/qaa-tools.cjs state patch --"Deliver Status" running --"Status" "Preparing delivery"
679
+ ```
680
+
681
+ **Print stage banner:**
682
+ ```
683
+ +------------------------------------------+
684
+ | STAGE 8: Deliver |
685
+ | Status: Running... |
686
+ +------------------------------------------+
687
+ ```
688
+
689
+ ### Sub-step 1: Pre-flight checks
690
+
691
+ Before attempting branch creation or PR, verify that the required tools are available and authenticated.
692
+
693
+ **Check for git remote:**
694
+ ```bash
695
+ REMOTE_URL=$(git remote get-url origin 2>/dev/null || echo "")
696
+ ```
697
+
698
+ If `REMOTE_URL` is empty:
699
+ - Print error: `"No git remote found. Artifacts committed locally but PR creation skipped."`
700
+ - Set `LOCAL_ONLY=true`
701
+ - Skip Sub-steps 6, 7, 8, 9 (push and PR creation). Still execute Sub-steps 2-5 (branch creation and commits) so artifacts are organized on a local branch.
702
+
703
+ **Check for gh CLI authentication:**
704
+ ```bash
705
+ gh auth status 2>/dev/null
706
+ ```
707
+
708
+ If `gh auth status` fails (non-zero exit code):
709
+ - Print error: `"gh CLI not authenticated. Run 'gh auth login' first. Artifacts committed locally."`
710
+ - Set `LOCAL_ONLY=true`
711
+ - Same skip behavior as above.
712
+
713
+ If both checks pass, set `LOCAL_ONLY=false`.
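The two pre-flight checks can be folded into one helper that reports whether delivery should stay local. This is a minimal sketch: `resolve_local_only` is a hypothetical name, and the remote URL and `gh` exit code are passed in as arguments so the function itself does not need `git` or `gh` installed.

```bash
# resolve_local_only <remote_url> <gh_auth_exit_code>
# Prints "true" when push and PR creation should be skipped, else "false".
# Diagnostics go to stderr so the result can be captured with $(...).
resolve_local_only() {
  local remote_url="$1" gh_status="$2"
  if [ -z "$remote_url" ]; then
    echo "No git remote found. Artifacts committed locally but PR creation skipped." >&2
    echo true
  elif [ "$gh_status" -ne 0 ]; then
    echo "gh CLI not authenticated. Run 'gh auth login' first. Artifacts committed locally." >&2
    echo true
  else
    echo false
  fi
}

# Usage (needs git and gh on PATH; shown as a comment so the sketch runs standalone):
#   REMOTE_URL=$(git remote get-url origin 2>/dev/null || echo "")
#   GH_STATUS=0; gh auth status >/dev/null 2>&1 || GH_STATUS=$?
#   LOCAL_ONLY=$(resolve_local_only "$REMOTE_URL" "$GH_STATUS")
```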
714
+
715
+ ### Sub-step 2: Derive project name
716
+
717
+ Read `dev_repo_path` from init context (available from Step 1).
718
+
719
+ **Attempt to read from package.json:**
720
+ ```bash
721
+ PROJECT_NAME=$(node -e "try { const p = require(require('path').resolve(process.argv[1], 'package.json')); console.log(p.name || ''); } catch { console.log(''); }" "${dev_repo_path}")
722
+ ```
723
+
724
+ **Fallback to directory basename:**
725
+ ```bash
726
+ if [ -z "$PROJECT_NAME" ]; then
727
+ PROJECT_NAME=$(basename "${dev_repo_path}")
728
+ fi
729
+ ```
730
+
731
+ **Sanitize for branch naming (lowercase, alphanumeric and hyphens only):**
732
+ ```bash
733
+ PROJECT_NAME=$(echo "$PROJECT_NAME" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]/-/g' | sed 's/--*/-/g' | sed 's/^-//' | sed 's/-$//')
734
+ ```
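To make the effect of the sanitization pipeline concrete, the same commands can be wrapped in a function and tried on a few names. The wrapper name `sanitize_project_name` and the sample inputs are illustrative only.

```bash
# sanitize_project_name <name>
# Lowercases the name, replaces every character outside [a-z0-9] with a hyphen,
# collapses runs of hyphens, and trims a leading or trailing hyphen.
sanitize_project_name() {
  echo "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed 's/[^a-z0-9]/-/g' \
    | sed 's/--*/-/g' \
    | sed 's/^-//' \
    | sed 's/-$//'
}

sanitize_project_name "My_App!!"         # prints: my-app
sanitize_project_name "@scope/pkg.name"  # prints: scope-pkg-name
```

Scoped package names therefore lose the `@` and `/`, which keeps the resulting `qa/auto-...` branch name to a single extra path segment.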
735
+
736
+ ### Sub-step 3: Detect default branch
737
+
738
+ ```bash
739
+ DEFAULT_BRANCH=$(gh repo view --json defaultBranchRef --jq '.defaultBranchRef.name' 2>/dev/null || echo "main")
740
+ ```
741
+
742
+ If `gh repo view` fails (e.g., no remote or gh not authenticated), fall back to `"main"`.
743
+
744
+ ### Sub-step 4: Create feature branch
745
+
746
+ **Branch name:** `qa/auto-{PROJECT_NAME}-{date}` where `date` comes from init context.
747
+
748
+ **Handle collision:** If branch already exists locally or remotely, append a numeric suffix:
749
+ ```bash
750
+ BRANCH="qa/auto-${PROJECT_NAME}-${date}"
751
+
752
+ # Check if branch exists locally or remotely
753
+ if git rev-parse --verify --quiet "$BRANCH" >/dev/null 2>&1 || git rev-parse --verify --quiet "origin/$BRANCH" >/dev/null 2>&1; then
754
+ SUFFIX=2
755
+ while git rev-parse --verify --quiet "${BRANCH}-${SUFFIX}" >/dev/null 2>&1 || git rev-parse --verify --quiet "origin/${BRANCH}-${SUFFIX}" >/dev/null 2>&1; do
756
+ SUFFIX=$((SUFFIX + 1))
757
+ done
758
+ BRANCH="${BRANCH}-${SUFFIX}"
759
+ fi
760
+ ```
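The collision handling above can also be written with the existence check factored out, which makes the suffix logic easy to exercise without a repository. This is a sketch under that design assumption; `branch_exists` and `next_free_branch` are hypothetical names, not part of the package.

```bash
# branch_exists <name>
# Succeeds when the branch resolves locally or under origin/.
branch_exists() {
  git rev-parse --verify --quiet "$1" >/dev/null 2>&1 ||
    git rev-parse --verify --quiet "origin/$1" >/dev/null 2>&1
}

# next_free_branch <base_name>
# Prints the base name if it is unused, otherwise the first free
# "<base>-2", "<base>-3", ... candidate.
next_free_branch() {
  local base="$1" candidate="$1" suffix=2
  while branch_exists "$candidate"; do
    candidate="${base}-${suffix}"
    suffix=$((suffix + 1))
  done
  echo "$candidate"
}

# Usage:
#   BRANCH=$(next_free_branch "qa/auto-${PROJECT_NAME}-${date}")
```

Because `branch_exists` is a separate function, it can be replaced with a stub in tests or swapped for a stricter check, such as one that queries `refs/heads/` explicitly.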
761
+
762
+ **Create from default branch:**
763
+ ```bash
764
+ git checkout -b "$BRANCH" "$DEFAULT_BRANCH"
765
+ ```
766
+
767
+ If branch creation fails, print error and fall through to local-only commit on current branch.
768
+
769
+ ### Sub-step 5: Per-stage atomic commits
770
+
771
+ For each pipeline stage that produced artifacts, commit using `qaa-tools.cjs commit`. Check file existence before each commit -- skip stages that did not produce artifacts.
772
+
773
+ **Scanner:**
774
+ ```bash
775
+ if [ -f "${output_dir}/SCAN_MANIFEST.md" ]; then
776
+ node bin/qaa-tools.cjs commit "qa(scanner): produce SCAN_MANIFEST.md for ${PROJECT_NAME}" --files "${output_dir}/SCAN_MANIFEST.md"
777
+ fi
778
+ ```
779
+
780
+ **Analyzer (Option 1):**
781
+ ```bash
782
+ if [ -f "${output_dir}/QA_ANALYSIS.md" ]; then
783
+ ANALYZER_FILES="${output_dir}/QA_ANALYSIS.md ${output_dir}/TEST_INVENTORY.md"
784
+ if [ -f "${output_dir}/QA_REPO_BLUEPRINT.md" ]; then
785
+ ANALYZER_FILES="${ANALYZER_FILES} ${output_dir}/QA_REPO_BLUEPRINT.md"
786
+ fi
787
+ node bin/qaa-tools.cjs commit "qa(analyzer): produce QA_ANALYSIS.md and TEST_INVENTORY.md" --files ${ANALYZER_FILES}
788
+ fi
789
+ ```
790
+
791
+ **Analyzer (Option 2/3):**
792
+ ```bash
793
+ if [ -f "${output_dir}/GAP_ANALYSIS.md" ]; then
794
+ node bin/qaa-tools.cjs commit "qa(analyzer): produce GAP_ANALYSIS.md" --files "${output_dir}/GAP_ANALYSIS.md"
795
+ fi
796
+ ```
797
+
798
+ **TestID Injector (if ran):**
799
+ ```bash
800
+ if [ -f "${output_dir}/TESTID_AUDIT_REPORT.md" ]; then
801
+ node bin/qaa-tools.cjs commit "qa(testid-injector): inject ${injected_count} data-testid attributes across ${component_count} components" --files "${output_dir}/TESTID_AUDIT_REPORT.md" ${modified_source_files}
802
+ fi
803
+ ```
804
+
805
+ Where `injected_count` and `component_count` are captured from the testid-injector return in Step 5, and `modified_source_files` are the frontend source files that were modified.
806
+
807
+ **Executor:**
808
+ ```bash
809
+ if [ -n "${generated_file_paths}" ]; then
810
+ node bin/qaa-tools.cjs commit "qa(executor): generate ${total_files} test files with POMs and fixtures" --files ${generated_file_paths}
811
+ fi
812
+ ```
813
+
814
+ Where `total_files` and `generated_file_paths` are captured from the executor return in Step 7.
815
+
816
+ **Validator:**
817
+ ```bash
818
+ if [ -f "${output_dir}/VALIDATION_REPORT.md" ]; then
819
+ node bin/qaa-tools.cjs commit "qa(validator): validate generated tests - ${overall_status} with ${confidence} confidence" --files "${output_dir}/VALIDATION_REPORT.md"
820
+ fi
821
+ ```
822
+
823
+ Where `overall_status` and `confidence` are captured from the validator return in Step 8.
824
+
825
+ **Bug Detective (if ran):**
826
+ ```bash
827
+ if [ -f "${output_dir}/FAILURE_CLASSIFICATION_REPORT.md" ]; then
828
+ node bin/qaa-tools.cjs commit "qa(bug-detective): classify ${total_failures} failures - ${classification_summary}" --files "${output_dir}/FAILURE_CLASSIFICATION_REPORT.md"
829
+ fi
830
+ ```
831
+
832
+ Where `total_failures` and `classification_summary` (e.g., "2 APP BUG, 2 TEST ERROR, 1 ENV ISSUE") are captured from the bug-detective return in Step 9.
833
+
834
+ ### Sub-step 6: Push branch
835
+
836
+ If `LOCAL_ONLY` is true, skip this sub-step.
837
+
838
+ ```bash
839
+ git push -u origin "$BRANCH"
840
+ ```
841
+
842
+ If push fails:
843
+ - Print error: `"Push failed: {error_message}. Artifacts committed locally on branch ${BRANCH}."`
844
+ - Set `LOCAL_ONLY=true` (skip PR creation but keep local commits).
845
+ - Do NOT set deliver status to failed -- artifacts are committed locally.
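
The push fallback above could be sketched as follows. This is a minimal illustration rather than the orchestrator's actual code: `try_push` is a hypothetical helper that receives the push command as its arguments, so the failure path can be exercised without a real remote.

```bash
#!/usr/bin/env bash
LOCAL_ONLY=${LOCAL_ONLY:-false}

# Hypothetical helper: run the push command given as arguments.
# On failure, print the error and switch to local-only mode so that
# the later PR sub-steps are skipped while local commits are kept.
try_push() {
  local err
  if ! err=$("$@" 2>&1); then
    echo "Push failed: ${err}. Artifacts committed locally on branch ${BRANCH}."
    LOCAL_ONLY=true
  fi
}

# Usage (assumes BRANCH was set in Sub-step 4):
#   try_push git push -u origin "$BRANCH"
```

The helper must be called in the current shell (not inside `$(...)`), otherwise the `LOCAL_ONLY` assignment would be lost in a subshell.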
846
+
847
+ ### Sub-step 7: Build PR body
848
+
849
+ If `LOCAL_ONLY` is true, skip this sub-step.
850
+
851
+ **Read the PR template:**
852
+ ```bash
853
+ PR_BODY=$(cat templates/pr-template.md)
854
+ ```
855
+
856
+ **Replace all `{placeholder}` tokens with actual values collected during pipeline execution:**
857
+
858
+ - `{architecture_type}` -- from QA_ANALYSIS.md architecture overview section (or init context if available). Example: "Next.js full-stack application"
859
+ - `{framework}` -- from QA_ANALYSIS.md or SCAN_MANIFEST.md framework detection. Example: "Playwright"
860
+ - `{risk_summary}` -- from QA_ANALYSIS.md risk assessment counts. Example: "3 HIGH, 5 MEDIUM, 2 LOW"
861
+ - `{unit_count}` -- from `pyramid_breakdown.unit` captured in Step 4 (analyzer return)
862
+ - `{integration_count}` -- from `pyramid_breakdown.integration`
863
+ - `{api_count}` -- from `pyramid_breakdown.api`
864
+ - `{e2e_count}` -- from `pyramid_breakdown.e2e`
865
+ - `{total_count}` -- from `total_test_count` captured in Step 4
866
+ - `{modules_covered}` -- count of modules/files with at least one test case in TEST_INVENTORY.md
867
+ - `{coverage_estimate}` -- estimated coverage percentage from QA_ANALYSIS.md recommended testing pyramid section
868
+ - `{validation_result}` -- from `overall_status` captured in Step 8 (validator return). PASS, PASS_WITH_WARNINGS, or FAIL
869
+ - `{confidence}` -- from `confidence` captured in Step 8. HIGH, MEDIUM, or LOW
870
+ - `{fix_loops_used}` -- from `fix_loops_used` captured in Step 8. Number 0-3
871
+ - `{issues_found}` -- from `issues_found` captured in Step 8
872
+ - `{issues_fixed}` -- from `issues_fixed` captured in Step 8
873
+ - `{file_list}` -- if total generated files <= 50, list each file as `- {path}`. If > 50, use summary: `{N} files across {M} directories`
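
One way the `{file_list}` rule could be implemented is sketched below. `build_file_list` is a hypothetical helper, and reading newline-separated paths from stdin is an assumption; the source does not say how `generated_file_paths` is delimited.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read newline-separated file paths on stdin and print
# either a bullet list (<= 50 files) or a one-line summary (> 50 files).
build_file_list() {
  local paths=() line
  while IFS= read -r line; do
    if [ -n "$line" ]; then
      paths+=("$line")
    fi
  done

  local n=${#paths[@]}
  if [ "$n" -eq 0 ]; then
    return 0
  fi

  if [ "$n" -le 50 ]; then
    printf -- '- %s\n' "${paths[@]}"
  else
    # Count distinct parent directories.
    local dirs p
    dirs=$(for p in "${paths[@]}"; do dirname "$p"; done | sort -u | wc -l | tr -d ' ')
    printf '%s files across %s directories\n' "$n" "$dirs"
  fi
}

# Usage:
#   FILE_LIST=$(printf '%s\n' tests/e2e/login.spec.ts tests/api/users.spec.ts | build_file_list)
```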
874
+
875
+ All replacements use simple string substitution. The PR body must be in English.
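
As a rough sketch of what "simple string substitution" could look like in bash, the hypothetical `fill_placeholder` helper below replaces every literal `{name}` token using parameter expansion. The template text in the usage example is invented for illustration and is not the content of `templates/pr-template.md`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: replace every literal {name} token in a string.
#   $1 = text, $2 = placeholder name (without braces), $3 = replacement value
fill_placeholder() {
  local body=$1 name=$2 value=$3
  local token="{${name}}"
  # Quoting the pattern and the replacement keeps both literal:
  # no glob matching, and no special treatment of '&' in the value.
  body=${body//"$token"/"$value"}
  printf '%s' "$body"
}

# Usage with an invented two-line template:
PR_BODY='Framework: {framework}
Validation: {validation_result} ({confidence} confidence)'
PR_BODY=$(fill_placeholder "$PR_BODY" framework "Playwright")
PR_BODY=$(fill_placeholder "$PR_BODY" validation_result "PASS")
PR_BODY=$(fill_placeholder "$PR_BODY" confidence "HIGH")
printf '%s\n' "$PR_BODY"
```

Tokens with no collected value are left untouched by this approach, which makes missing data easy to spot in the draft PR.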
876
+
877
+ ### Sub-step 8: Create draft PR
878
+
879
+ If `LOCAL_ONLY` is true, skip this sub-step.
880
+
881
+ ```bash
882
+ PR_URL=$(gh pr create \
883
+ --draft \
884
+ --title "qa: automated test suite for ${PROJECT_NAME}" \
885
+ --body "${PR_BODY}" \
886
+ --label "qa-automation" \
887
+ --label "auto-generated" \
888
+ --assignee "@me" 2>&1)
889
+ ```
890
+
891
+ **Do NOT pass `--base` flag.** Let gh auto-detect the default branch to avoid errors when the default branch is not named "main".
892
+
893
+ **On success:** Capture the PR URL from stdout. The URL is the last line of `gh pr create` output.
894
+
895
+ **On failure:**
896
+ - Print error: `"PR creation failed: ${PR_URL}. Artifacts remain on branch ${BRANCH}."`
897
+ - Set deliver status to failed:
898
+ ```bash
899
+ node bin/qaa-tools.cjs state patch --"Deliver Status" failed --"Status" "Deliver failed: PR creation error"
900
+ ```
901
+ - Do NOT stop the pipeline -- artifacts are committed and pushed. The QA engineer can create the PR manually.
902
+
903
+ ### Sub-step 9: Print PR URL
904
+
905
+ If PR was created successfully:
906
+ ```
907
+ PR created: ${PR_URL}
908
+ ```
909
+
910
+ If `LOCAL_ONLY` is true:
911
+ ```
912
+ PR: not created (local-only mode). Artifacts committed on branch: ${BRANCH}
913
+ ```
914
+
915
+ Store `PR_URL` (or the local-only message) for inclusion in the pipeline summary banner.
916
+
917
+ **State update -- mark deliver as complete (on success):**
918
+ ```bash
919
+ node bin/qaa-tools.cjs state patch --"Deliver Status" complete --"Status" "Pipeline complete"
920
+ ```
921
+
922
+ **Clear auto-chain flag at pipeline completion:**
923
+ ```bash
924
+ node bin/qaa-tools.cjs config-set workflow._auto_chain_active false
925
+ ```
926
+ </step>
927
+
928
+ </process>
929
+
930
+ <auto_advance>
931
+ ## Auto-Advance Mode
932
+
933
+ Auto-advance is enabled when ANY of these is true:
934
+ - config.json `workflow.auto_advance = true` (persistent user preference)
935
+ - `--auto` flag passed to orchestrator invocation (per-run override)
936
+ - `workflow._auto_chain_active = true` in config (ephemeral chain flag from ongoing auto run)
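
The three conditions above combine with a logical OR. A minimal sketch, assuming the three values have already been read into shell variables as `true`/`false` strings (how they are read from config.json is not shown in this section, and `is_auto_mode` is a hypothetical name):

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit 0) when auto-advance should be active.
#   $1 = workflow.auto_advance from config.json
#   $2 = "true" if --auto was passed on this invocation
#   $3 = workflow._auto_chain_active from config
is_auto_mode() {
  local cfg_auto_advance=$1 auto_flag=$2 chain_active=$3
  [ "$cfg_auto_advance" = "true" ] ||
    [ "$auto_flag" = "true" ] ||
    [ "$chain_active" = "true" ]
}

# Usage:
#   if is_auto_mode "$CFG_AUTO" "$FLAG_AUTO" "$CHAIN_ACTIVE"; then IS_AUTO=true; else IS_AUTO=false; fi
```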
937
+
938
+ ### Behavior in Auto Mode
939
+
940
+ **Safe checkpoints are auto-approved.** The pipeline continues without pausing. A log message records the auto-approval:
941
+ ```
942
+ Auto-approved: {checkpoint_description}
943
+ ```
944
+
945
+ **Risky checkpoints ALWAYS pause.** Even in auto mode, the pipeline stops and presents the checkpoint to the user. This is a locked decision -- unresolved validation issues and application bugs require human judgment.
946
+
947
+ **Full progress banners are shown** even in auto mode: the user sees stage banners, agent-spawning indicators, and completion messages in the terminal. Auto mode does not suppress output.
948
+
949
+ **On stage failure in auto mode:** STOP PIPELINE ENTIRELY. Report which stage failed and why. No partial PR. User must intervene.
950
+
951
+ ### Safe vs Risky Checkpoint Classification
952
+
953
+ See the `<checkpoint_system>` section below for the complete classification table and handling flow.
954
+
955
+ ### Stale Chain Flag Protection
956
+
957
+ At orchestrator init, if `--auto` was NOT passed:
958
+ ```bash
959
+ node bin/qaa-tools.cjs config-set workflow._auto_chain_active false
960
+ ```
961
+ This prevents a previous interrupted `--auto` run from causing unexpected auto-advance in a new manual session.
962
+
963
+ ### Auto Mode Persistence
964
+
965
+ When `--auto` is passed:
966
+ ```bash
967
+ node bin/qaa-tools.cjs config-set workflow._auto_chain_active true
968
+ ```
969
+ This flag persists across agent spawns within the same pipeline run. Each spawned agent can check it to maintain auto-advance behavior through the chain.
970
+
971
+ At pipeline completion (success or failure), clear the chain flag:
972
+ ```bash
973
+ node bin/qaa-tools.cjs config-set workflow._auto_chain_active false
974
+ ```
975
+ </auto_advance>
976
+
977
+ <checkpoint_system>
978
+ ## Checkpoint System
979
+
980
+ ### Checkpoint Classification
981
+
982
+ Every agent may return checkpoint data when it encounters a situation requiring human input. The orchestrator classifies each checkpoint as SAFE or RISKY and handles it accordingly.
983
+
984
+ **SAFE checkpoints (auto-approve in auto mode):**
985
+
986
+ | Checkpoint | Agent | Why Safe | Auto-Action |
987
+ |------------|-------|----------|-------------|
988
+ | Framework detection uncertain (LOW confidence) | Scanner | Auto-select most likely framework; analysis can continue with reasonable default | Approve with most likely framework |
989
+ | Analyzer assumptions review | Analyzer | Assumptions are informational; incorrect assumptions produce suboptimal but not broken output | Approve all assumptions |
990
+ | TestID audit review | TestID Injector | P0-only injection is conservative; only forms, buttons, and primary actions receive test IDs | Approve P0-only injection |
991
+
992
+ **RISKY checkpoints (ALWAYS pause, even in auto mode):**
993
+
994
+ | Checkpoint | Agent | Why Risky | User Action Required |
995
+ |------------|-------|-----------|---------------------|
996
+ | Validator escalation (unresolved issues after 3 fix loops) | Validator | Unresolved issues mean tests may be broken; delivering broken tests defeats the purpose | User decides: approve-with-warnings, abort, or provide fix guidance |
997
+ | APPLICATION BUG classification | Bug Detective | Genuine bugs in application code require developer action, not auto-fix | User reviews bug evidence and decides whether to continue or fix first |
998
+ | Any checkpoint with `blocking` containing "unresolved" or "failed" | Any agent | Indicates pipeline integrity risk; proceeding could produce incorrect artifacts | User reviews the specific blocking issue |
999
+
1000
+ ### Checkpoint Handling Flow
1001
+
1002
+ ```
1003
+ On agent return with checkpoint data:
1004
+ 1. Extract checkpoint `blocking` field content
1005
+ 2. Classify as SAFE or RISKY:
1006
+ - Match against safe patterns:
1007
+ "framework detection" -> SAFE
1008
+ "assumptions" -> SAFE
1009
+ "audit" or "data-testid" -> SAFE
1010
+ - Match against risky patterns:
1011
+ "unresolved" -> RISKY
1012
+ "failed" -> RISKY
1013
+ "APPLICATION BUG" -> RISKY
1014
+ - Default (no pattern match) -> RISKY (conservative)
1015
+ 3. If IS_AUTO and SAFE:
1016
+ - Auto-approve with default action
1017
+ - Log: "Auto-approved: {checkpoint_description}"
1018
+ - Continue pipeline to next stage
1019
+ 4. If IS_AUTO and RISKY:
1020
+ - PAUSE pipeline
1021
+ - Print checkpoint details with full context:
1022
+ - What stage triggered the checkpoint
1023
+ - What was completed so far
1024
+ - The specific blocking issue
1025
+ - What artifacts have been produced
1026
+ - Wait for user input
1027
+ - On user response: spawn fresh continuation agent
1028
+ 5. If NOT auto (manual mode):
1029
+ - PAUSE pipeline
1030
+ - Print checkpoint details with full context
1031
+ - Wait for user input
1032
+ - On user response: spawn fresh continuation agent
1033
+ ```
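
The classification in step 2 could be sketched as a single `case` statement. Two details here are assumptions not stated in the flow: matching is case-insensitive, and risky patterns are checked before safe ones so that a `blocking` value matching both (for example "audit failed") is treated as RISKY. `classify_checkpoint` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: classify a checkpoint from its `blocking` text.
# Prints SAFE or RISKY. Anything unrecognized is RISKY (conservative default).
classify_checkpoint() {
  local blocking
  # Assumption: compare case-insensitively by lowercasing the input.
  blocking=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$blocking" in
    # Assumption: risky patterns take precedence over safe ones.
    *unresolved* | *failed* | *"application bug"*)
      echo "RISKY" ;;
    *"framework detection"* | *assumptions* | *audit* | *data-testid*)
      echo "SAFE" ;;
    *)
      echo "RISKY" ;;
  esac
}

# Usage:
#   CLASS=$(classify_checkpoint "$CHECKPOINT_BLOCKING")
```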
1034
+
1035
+ ### Resume After Checkpoint
1036
+
1037
+ When resuming after a checkpoint, spawn a FRESH agent (not serialized state). This follows the GSD pattern: fresh agent with explicit state is more reliable than serialized continuation.
1038
+
1039
+ ```
1040
+ Task(
1041
+ prompt="
1042
+ <objective>Continue QA pipeline from {stage} stage</objective>
1043
+ <execution_context>@agents/qa-pipeline-orchestrator.md</execution_context>
1044
+ <resume_context>
1045
+ Pipeline state:
1046
+ - Completed stages: {list of completed stages with their results}
1047
+ - Current stage: {stage that triggered checkpoint}
1048
+ - Checkpoint response: {user's response or decision}
1049
+ - Artifacts produced so far: {list of files with paths}
1050
+
1051
+ Resume from: {exact step in pipeline to resume from}
1052
+ User decision: {what user chose at checkpoint}
1053
+ </resume_context>
1054
+ "
1055
+ )
1056
+ ```
1057
+
1058
+ The continuation agent reads this resume_context, verifies the completed stages by checking artifact existence on disk, and continues from the specified point. It does NOT re-execute completed stages.
1059
+
1060
+ ### Checkpoint Return Structure
1061
+
1062
+ Agents return checkpoints in this structure:
1063
+ ```
1064
+ CHECKPOINT_RETURN:
1065
+ completed: "What has been done so far"
1066
+ blocking: "What is blocking progress"
1067
+ details: "Detailed context about the blocking issue"
1068
+ awaiting: "What the user needs to do or provide"
1069
+ ```
1070
+
1071
+ The orchestrator parses the `blocking` field to classify the checkpoint.
1072
+ </checkpoint_system>
1073
+
1074
+ <error_handling>
1075
+ ## Error Handling
1076
+
1077
+ ### Stage Failure Protocol
1078
+
1079
+ When any agent returns a failure or error:
1080
+
1081
+ 1. **Set stage status to `failed`:**
1082
+ ```bash
1083
+ node bin/qaa-tools.cjs state patch --"{Stage} Status" failed --"Status" "Pipeline stopped: {Stage} failed - {reason}"
1084
+ ```
1085
+
1086
+ 2. **Print failure banner:**
1087
+ ```
1088
+ !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
1089
+ ! PIPELINE STOPPED !
1090
+ ! Stage: {stage_name} !
1091
+ ! Reason: {failure_reason} !
1092
+ ! !
1093
+ ! Completed: {completed_stages} !
1094
+ ! Artifacts: {artifacts_so_far} !
1095
+ ! !
1096
+ ! Action required: Review and re-run !
1097
+ !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
1098
+ ```
1099
+
1100
+ 3. **DO NOT continue to next stage.** The pipeline stops entirely at the failed stage.
1101
+
1102
+ 4. **DO NOT create partial PR.** No branch, no commit, no PR with incomplete results.
1103
+
1104
+ 5. **Preserve all artifacts produced so far.** They may be useful for debugging the failure. Artifacts from completed stages remain on disk in `{output_dir}/`.
1105
+
1106
+ ### Agent Return Validation
1107
+
1108
+ After EVERY agent spawn, before advancing to next stage:
1109
+
1110
+ 1. **Check the return for error/stop conditions:**
1111
+ - Scanner: Check `decision` field -- if `STOP`, pipeline stops
1112
+ - Validator: Check `overall_status` -- if `FAIL` with unresolved issues, checkpoint triggers
1113
+ - Bug Detective: Check `classification_breakdown.app_bug` -- if > 0, checkpoint triggers
1114
+ - Any agent: Check for error messages, empty returns, or missing expected fields
1115
+
1116
+ 2. **Verify expected output artifacts exist on disk:**
1117
+ ```bash
1118
+ [ -f "{expected_artifact_path}" ] && echo "OK" || echo "MISSING"
1119
+ ```
1120
+ - Scanner: `{output_dir}/SCAN_MANIFEST.md` must exist
1121
+ - Analyzer: `{output_dir}/QA_ANALYSIS.md` (Option 1) or `{output_dir}/GAP_ANALYSIS.md` (Options 2/3) must exist
1122
+ - Planner: `{output_dir}/GENERATION_PLAN.md` must exist
1123
+ - Executor: All planned test files must exist
1124
+ - Validator: `{output_dir}/VALIDATION_REPORT.md` must exist
1125
+
1126
+ 3. **If artifacts missing:** Treat as stage failure. Set status to failed and stop pipeline.
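
The artifact check in step 2 generalizes to a list of paths. The sketch below uses a hypothetical `verify_artifacts` helper; a non-zero return is meant to be treated as a stage failure, as step 3 describes.

```bash
#!/usr/bin/env bash
# Hypothetical helper: check that every path given as an argument is an
# existing regular file. Reports each missing path on stderr and returns
# non-zero if any are missing.
verify_artifacts() {
  local missing=0 path
  for path in "$@"; do
    if [ ! -f "$path" ]; then
      echo "MISSING: $path" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (Option 1 analyzer stage; output_dir is assumed to be set):
#   verify_artifacts "${output_dir}/QA_ANALYSIS.md" "${output_dir}/TEST_INVENTORY.md" \
#     || echo "Analyzer stage failed: artifacts missing"
```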
1127
+
1128
+ ### Retry Policy
1129
+
1130
+ The orchestrator does NOT retry failed agents automatically. If a stage fails:
1131
+
1132
+ - **In auto mode:** Stop pipeline entirely and report the failure. Print which stage failed, what error occurred, and what artifacts were produced before failure.
1133
+ - **In manual mode:** Stop and present the failure to user. User can choose to:
1134
+ - Retry the failed stage (orchestrator spawns the same agent again)
1135
+ - Abort the pipeline
1136
+ - Provide guidance and retry with modifications
1137
+ </error_handling>
1138
+
1139
+ <pipeline_summary>
1140
+ ## Pipeline Summary
1141
+
1142
+ After all stages complete (or on pipeline stop), print a summary banner:
1143
+
1144
+ ```
1145
+ ======================================================
1146
+ QA PIPELINE COMPLETE
1147
+ ======================================================
1148
+
1149
+ Option: {option} ({option_description})
1150
+ Repository: {dev_repo_path}
1151
+ QA Repo: {qa_repo_path or 'N/A'}
1152
+ Maturity Score: {maturity_score or 'N/A'}
1153
+
1154
+ Stages Completed:
1155
+ [{check}] Scan -- {scan_duration} {scan_extra}
1156
+ [{check}] Analyze -- {analyze_duration} ({test_count} test cases)
1157
+ [{check}] TestID Inject -- {inject_duration or 'skipped'}
1158
+ [{check}] Plan -- {plan_duration} ({file_count} files planned)
1159
+ [{check}] Generate -- {generate_duration} ({files_created} files created)
1160
+ [{check}] Validate -- {validate_duration} ({confidence} confidence)
1161
+ [{check}] Bug Detective -- {detective_duration or 'skipped'}
1162
+ [{check}] Deliver -- {deliver_duration}
1163
+
1164
+ PR: {pr_url or 'not created (local-only)'}
1165
+
1166
+ Artifacts:
1167
+ {list all produced .md files in output_dir}
1168
+
1169
+ Total Time: {total_duration}
1170
+ ======================================================
1171
+ ```
1172
+
1173
+ Where:
1174
+ - `[x]` = stage completed successfully
1175
+ - `[ ]` = stage skipped (testid-inject when no frontend, bug-detective when no failures)
1176
+ - `[!]` = stage failed
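
For illustration, the marker legend could map from a stage status string as sketched below. The status values `complete` and `failed` appear in the state updates elsewhere in this file; treating every other value as "skipped" is an assumption, and `stage_marker` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map a stage status to its summary-banner marker.
stage_marker() {
  case "$1" in
    complete) echo "[x]" ;;
    failed)   echo "[!]" ;;
    # Assumption: any other status (e.g. "skipped" or empty) renders as skipped.
    *)        echo "[ ]" ;;
  esac
}

# Usage:
#   printf '%s Scan\n' "$(stage_marker "$SCAN_STATUS")"
```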
1177
+
1178
+ **On pipeline failure:** The summary still prints, but shows which stages completed and which failed, along with the failure reason.
1179
+
1180
+ **Artifact list includes:**
1181
+ - SCAN_MANIFEST.md (always)
1182
+ - QA_ANALYSIS.md (Option 1) or GAP_ANALYSIS.md (Options 2/3)
1183
+ - TEST_INVENTORY.md (Option 1)
1184
+ - QA_REPO_BLUEPRINT.md (Option 1, if produced)
1185
+ - TESTID_AUDIT_REPORT.md (if frontend detected)
1186
+ - GENERATION_PLAN.md (if plan stage completed)
1187
+ - Generated test files (if generate stage completed)
1188
+ - VALIDATION_REPORT.md (if validate stage completed)
1189
+ - FAILURE_CLASSIFICATION_REPORT.md (if bug detective ran)
1190
+ </pipeline_summary>
1191
+
1192
+ <quality_gate>
1193
+ ## Quality Gate
1194
+
1195
+ Before this orchestrator is considered complete, verify:
1196
+
1197
+ - [ ] All 3 workflow options route to correct stage sequences:
1198
+ - Option 1: scan(dev) -> analyze(full) -> [testid-inject] -> plan -> generate -> validate -> [bug-detective] -> deliver
1199
+ - Option 2: scan(both) -> analyze(gap) -> [testid-inject] -> plan(gap) -> generate(gap) -> validate -> [bug-detective] -> deliver
1200
+ - Option 3: scan(both) -> analyze(gap) -> [testid-inject] -> plan(gap) -> generate(skip-existing) -> validate -> [bug-detective] -> deliver
1201
+ - [ ] Every agent spawn is bracketed by state updates (running before, complete/failed after)
1202
+ - [ ] Auto-advance correctly classifies safe vs risky checkpoints
1203
+ - [ ] Pipeline stops entirely on any stage failure (no partial PR)
1204
+ - [ ] Progress banners print for every stage even in auto mode
1205
+ - [ ] Deliver stage creates branch, commits per-stage, pushes, and creates draft PR via gh CLI
1206
+ - [ ] Resume spawns fresh agent with explicit state (no serialization)
1207
+ </quality_gate>
1208
+
1209
+ <success_criteria>
1210
+ ## Success Criteria
1211
+
1212
+ 1. QA engineer can invoke orchestrator and pipeline runs through all stages for their repo type
1213
+ 2. Option detection is automatic based on repo count and maturity scoring
1214
+ 3. Pipeline state in STATE.md accurately reflects progress at every point
1215
+ 4. Checkpoints pause when appropriate and auto-approve when safe
1216
+ 5. Failure in any stage stops the pipeline cleanly with actionable error message
1217
+ </success_criteria>