maestro-flow 0.3.46 → 0.3.47

Files changed (241)
  1. package/.claude/agents/ui-design-agent.md +1 -0
  2. package/.claude/agents/workflow-executor.md +3 -0
  3. package/.claude/commands/learn-decompose.md +91 -146
  4. package/.claude/commands/learn-follow.md +102 -137
  5. package/.claude/commands/learn-investigate.md +102 -167
  6. package/.claude/commands/learn-retro.md +100 -243
  7. package/.claude/commands/learn-second-opinion.md +95 -135
  8. package/.claude/commands/maestro-amend.md +95 -232
  9. package/.claude/commands/maestro-analyze.md +1 -6
  10. package/.claude/commands/maestro-collab.md +104 -265
  11. package/.claude/commands/maestro-composer.md +113 -293
  12. package/.claude/commands/maestro-execute.md +10 -17
  13. package/.claude/commands/maestro-impeccable.md +89 -0
  14. package/.claude/commands/maestro-plan.md +1 -6
  15. package/.claude/commands/maestro-player.md +111 -340
  16. package/.claude/commands/maestro-quick.md +9 -0
  17. package/.claude/commands/maestro-ralph-execute.md +167 -210
  18. package/.claude/commands/maestro-ralph.md +245 -426
  19. package/.claude/commands/maestro-ui-codify.md +13 -0
  20. package/.claude/commands/maestro-ui-craft.md +364 -0
  21. package/.claude/commands/maestro-ui-design.md +12 -1
  22. package/.claude/commands/maestro-verify.md +12 -13
  23. package/.claude/commands/maestro.md +142 -72
  24. package/.claude/commands/manage-knowhow-capture.md +45 -170
  25. package/.claude/commands/quality-auto-test.md +9 -0
  26. package/.claude/commands/quality-debug.md +11 -25
  27. package/.claude/commands/quality-refactor.md +9 -0
  28. package/.claude/commands/quality-review.md +5 -14
  29. package/.claude/commands/spec-add.md +1 -1
  30. package/.claude/commands/spec-load.md +3 -2
  31. package/.claude/skills/maestro-impeccable/SKILL.md +169 -0
  32. package/.codex/skills/learn-decompose/SKILL.md +1 -1
  33. package/.codex/skills/learn-investigate/SKILL.md +2 -1
  34. package/.codex/skills/maestro/SKILL.md +278 -313
  35. package/.codex/skills/maestro-analyze/SKILL.md +126 -417
  36. package/.codex/skills/maestro-brainstorm/SKILL.md +129 -451
  37. package/.codex/skills/maestro-collab/SKILL.md +134 -547
  38. package/.codex/skills/maestro-execute/SKILL.md +3 -1
  39. package/.codex/skills/maestro-impeccable/SKILL.md +112 -0
  40. package/.codex/skills/maestro-plan/SKILL.md +88 -437
  41. package/.codex/skills/maestro-player/SKILL.md +191 -333
  42. package/.codex/skills/maestro-quick/SKILL.md +2 -0
  43. package/.codex/skills/maestro-ralph/SKILL.md +307 -710
  44. package/.codex/skills/maestro-roadmap/SKILL.md +201 -518
  45. package/.codex/skills/maestro-ui-codify/SKILL.md +1 -0
  46. package/.codex/skills/maestro-ui-craft/SKILL.md +341 -0
  47. package/.codex/skills/maestro-ui-design/SKILL.md +10 -0
  48. package/.codex/skills/maestro-verify/SKILL.md +116 -409
  49. package/.codex/skills/quality-auto-test/SKILL.md +145 -443
  50. package/.codex/skills/quality-refactor/SKILL.md +1 -1
  51. package/.codex/skills/quality-test/SKILL.md +229 -517
  52. package/.codex/skills/spec-add/SKILL.md +1 -1
  53. package/README.md +4 -1
  54. package/README.zh-CN.md +3 -1
  55. package/dashboard/dist-server/dashboard/src/server/agents/codex-cli-adapter.js +3 -0
  56. package/dashboard/dist-server/dashboard/src/server/agents/codex-cli-adapter.js.map +1 -1
  57. package/dashboard/dist-server/dashboard/src/server/routes/install.js +110 -1
  58. package/dashboard/dist-server/dashboard/src/server/routes/install.js.map +1 -1
  59. package/dashboard/dist-server/dashboard/src/server/routes/settings.js +56 -0
  60. package/dashboard/dist-server/dashboard/src/server/routes/settings.js.map +1 -1
  61. package/dashboard/dist-server/dashboard/src/server/routes/wiki.js +2 -0
  62. package/dashboard/dist-server/dashboard/src/server/routes/wiki.js.map +1 -1
  63. package/dashboard/dist-server/dashboard/src/server/wiki/spec-entry-parser.js +2 -2
  64. package/dashboard/dist-server/dashboard/src/server/wiki/spec-entry-parser.js.map +1 -1
  65. package/dashboard/dist-server/dashboard/src/server/wiki/wiki-indexer.js +2 -0
  66. package/dashboard/dist-server/dashboard/src/server/wiki/wiki-indexer.js.map +1 -1
  67. package/dashboard/dist-server/dashboard/src/server/wiki/wiki-types.d.ts +3 -1
  68. package/dashboard/dist-server/dashboard/src/shared/constants.d.ts +2 -0
  69. package/dashboard/dist-server/dashboard/src/shared/constants.js +2 -0
  70. package/dashboard/dist-server/dashboard/src/shared/constants.js.map +1 -1
  71. package/dist/src/agents/cli-agent-runner.d.ts.map +1 -1
  72. package/dist/src/agents/cli-agent-runner.js +1 -3
  73. package/dist/src/agents/cli-agent-runner.js.map +1 -1
  74. package/dist/src/agents/cli-history-store.d.ts +5 -0
  75. package/dist/src/agents/cli-history-store.d.ts.map +1 -1
  76. package/dist/src/agents/cli-history-store.js +65 -13
  77. package/dist/src/agents/cli-history-store.js.map +1 -1
  78. package/dist/src/cli.js +13 -0
  79. package/dist/src/cli.js.map +1 -1
  80. package/dist/src/commands/command-help.d.ts +3 -0
  81. package/dist/src/commands/command-help.d.ts.map +1 -0
  82. package/dist/src/commands/command-help.js +60 -0
  83. package/dist/src/commands/command-help.js.map +1 -0
  84. package/dist/src/commands/config.d.ts.map +1 -1
  85. package/dist/src/commands/config.js +17 -0
  86. package/dist/src/commands/config.js.map +1 -1
  87. package/dist/src/commands/delegate.d.ts.map +1 -1
  88. package/dist/src/commands/delegate.js +12 -2
  89. package/dist/src/commands/delegate.js.map +1 -1
  90. package/dist/src/commands/impeccable.d.ts +10 -0
  91. package/dist/src/commands/impeccable.d.ts.map +1 -0
  92. package/dist/src/commands/impeccable.js +181 -0
  93. package/dist/src/commands/impeccable.js.map +1 -0
  94. package/dist/src/commands/spec.js +1 -1
  95. package/dist/src/commands/spec.js.map +1 -1
  96. package/dist/src/commands/wiki.d.ts.map +1 -1
  97. package/dist/src/commands/wiki.js +5 -1
  98. package/dist/src/commands/wiki.js.map +1 -1
  99. package/dist/src/config/cli-tools-config.d.ts.map +1 -1
  100. package/dist/src/config/cli-tools-config.js +10 -7
  101. package/dist/src/config/cli-tools-config.js.map +1 -1
  102. package/dist/src/core/addon-registry.d.ts +31 -0
  103. package/dist/src/core/addon-registry.d.ts.map +1 -0
  104. package/dist/src/core/addon-registry.js +28 -0
  105. package/dist/src/core/addon-registry.js.map +1 -0
  106. package/dist/src/hooks/plugins/spec-injection-plugin.js +2 -0
  107. package/dist/src/hooks/plugins/spec-injection-plugin.js.map +1 -1
  108. package/dist/src/hooks/spec-injector.js +2 -2
  109. package/dist/src/hooks/spec-injector.js.map +1 -1
  110. package/dist/src/index.d.ts +2 -0
  111. package/dist/src/index.d.ts.map +1 -1
  112. package/dist/src/index.js +1 -0
  113. package/dist/src/index.js.map +1 -1
  114. package/dist/src/tools/impeccable/critique-storage.d.ts +28 -0
  115. package/dist/src/tools/impeccable/critique-storage.d.ts.map +1 -0
  116. package/dist/src/tools/impeccable/critique-storage.js +120 -0
  117. package/dist/src/tools/impeccable/critique-storage.js.map +1 -0
  118. package/dist/src/tools/impeccable/design-parser.d.ts +90 -0
  119. package/dist/src/tools/impeccable/design-parser.d.ts.map +1 -0
  120. package/dist/src/tools/impeccable/design-parser.js +696 -0
  121. package/dist/src/tools/impeccable/design-parser.js.map +1 -0
  122. package/dist/src/tools/impeccable/detect-csp.d.ts +6 -0
  123. package/dist/src/tools/impeccable/detect-csp.d.ts.map +1 -0
  124. package/dist/src/tools/impeccable/detect-csp.js +130 -0
  125. package/dist/src/tools/impeccable/detect-csp.js.map +1 -0
  126. package/dist/src/tools/impeccable/is-generated.d.ts +4 -0
  127. package/dist/src/tools/impeccable/is-generated.d.ts.map +1 -0
  128. package/dist/src/tools/impeccable/is-generated.js +56 -0
  129. package/dist/src/tools/impeccable/is-generated.js.map +1 -0
  130. package/dist/src/tools/impeccable/live/accept.d.ts +50 -0
  131. package/dist/src/tools/impeccable/live/accept.d.ts.map +1 -0
  132. package/dist/src/tools/impeccable/live/accept.js +556 -0
  133. package/dist/src/tools/impeccable/live/accept.js.map +1 -0
  134. package/dist/src/tools/impeccable/live/bootstrap.d.ts +2 -0
  135. package/dist/src/tools/impeccable/live/bootstrap.d.ts.map +1 -0
  136. package/dist/src/tools/impeccable/live/bootstrap.js +244 -0
  137. package/dist/src/tools/impeccable/live/bootstrap.js.map +1 -0
  138. package/dist/src/tools/impeccable/live/complete.d.ts +7 -0
  139. package/dist/src/tools/impeccable/live/complete.d.ts.map +1 -0
  140. package/dist/src/tools/impeccable/live/complete.js +67 -0
  141. package/dist/src/tools/impeccable/live/complete.js.map +1 -0
  142. package/dist/src/tools/impeccable/live/completion.d.ts +24 -0
  143. package/dist/src/tools/impeccable/live/completion.d.ts.map +1 -0
  144. package/dist/src/tools/impeccable/live/completion.js +26 -0
  145. package/dist/src/tools/impeccable/live/completion.js.map +1 -0
  146. package/dist/src/tools/impeccable/live/inject.d.ts +41 -0
  147. package/dist/src/tools/impeccable/live/inject.d.ts.map +1 -0
  148. package/dist/src/tools/impeccable/live/inject.js +394 -0
  149. package/dist/src/tools/impeccable/live/inject.js.map +1 -0
  150. package/dist/src/tools/impeccable/live/poll.d.ts +24 -0
  151. package/dist/src/tools/impeccable/live/poll.d.ts.map +1 -0
  152. package/dist/src/tools/impeccable/live/poll.js +180 -0
  153. package/dist/src/tools/impeccable/live/poll.js.map +1 -0
  154. package/dist/src/tools/impeccable/live/resume.d.ts +5 -0
  155. package/dist/src/tools/impeccable/live/resume.d.ts.map +1 -0
  156. package/dist/src/tools/impeccable/live/resume.js +30 -0
  157. package/dist/src/tools/impeccable/live/resume.js.map +1 -0
  158. package/dist/src/tools/impeccable/live/server.d.ts +6 -0
  159. package/dist/src/tools/impeccable/live/server.d.ts.map +1 -0
  160. package/dist/src/tools/impeccable/live/server.js +867 -0
  161. package/dist/src/tools/impeccable/live/server.js.map +1 -0
  162. package/dist/src/tools/impeccable/live/session-store.d.ts +72 -0
  163. package/dist/src/tools/impeccable/live/session-store.d.ts.map +1 -0
  164. package/dist/src/tools/impeccable/live/session-store.js +281 -0
  165. package/dist/src/tools/impeccable/live/session-store.js.map +1 -0
  166. package/dist/src/tools/impeccable/live/static/live-browser-session.js +123 -0
  167. package/dist/src/tools/impeccable/live/static/live-browser.js +4860 -0
  168. package/dist/src/tools/impeccable/live/static/modern-screenshot.umd.js +14 -0
  169. package/dist/src/tools/impeccable/live/status.d.ts +2 -0
  170. package/dist/src/tools/impeccable/live/status.d.ts.map +1 -0
  171. package/dist/src/tools/impeccable/live/status.js +52 -0
  172. package/dist/src/tools/impeccable/live/status.js.map +1 -0
  173. package/dist/src/tools/impeccable/live/wrap.d.ts +33 -0
  174. package/dist/src/tools/impeccable/live/wrap.d.ts.map +1 -0
  175. package/dist/src/tools/impeccable/live/wrap.js +572 -0
  176. package/dist/src/tools/impeccable/live/wrap.js.map +1 -0
  177. package/dist/src/tools/impeccable/load-context.d.ts +13 -0
  178. package/dist/src/tools/impeccable/load-context.d.ts.map +1 -0
  179. package/dist/src/tools/impeccable/load-context.js +79 -0
  180. package/dist/src/tools/impeccable/load-context.js.map +1 -0
  181. package/dist/src/tools/impeccable/paths.d.ts +34 -0
  182. package/dist/src/tools/impeccable/paths.d.ts.map +1 -0
  183. package/dist/src/tools/impeccable/paths.js +102 -0
  184. package/dist/src/tools/impeccable/paths.js.map +1 -0
  185. package/dist/src/tools/spec-entry-parser.d.ts +1 -1
  186. package/dist/src/tools/spec-entry-parser.d.ts.map +1 -1
  187. package/dist/src/tools/spec-entry-parser.js +1 -1
  188. package/dist/src/tools/spec-entry-parser.js.map +1 -1
  189. package/dist/src/tools/spec-init.d.ts.map +1 -1
  190. package/dist/src/tools/spec-init.js +26 -1
  191. package/dist/src/tools/spec-init.js.map +1 -1
  192. package/dist/src/tools/spec-loader.d.ts +1 -1
  193. package/dist/src/tools/spec-loader.d.ts.map +1 -1
  194. package/dist/src/tools/spec-loader.js +2 -0
  195. package/dist/src/tools/spec-loader.js.map +1 -1
  196. package/package.json +2 -2
  197. package/workflows/claude-instructions.md +17 -5
  198. package/workflows/cli-tools-usage.md +10 -3
  199. package/workflows/delegate-usage.md +3 -2
  200. package/workflows/impeccable/adapt.md +190 -0
  201. package/workflows/impeccable/animate.md +175 -0
  202. package/workflows/impeccable/audit.md +133 -0
  203. package/workflows/impeccable/bolder.md +113 -0
  204. package/workflows/impeccable/brand.md +118 -0
  205. package/workflows/impeccable/clarify.md +174 -0
  206. package/workflows/impeccable/codex.md +105 -0
  207. package/workflows/impeccable/cognitive-load.md +106 -0
  208. package/workflows/impeccable/color-and-contrast.md +105 -0
  209. package/workflows/impeccable/colorize.md +154 -0
  210. package/workflows/impeccable/craft.md +123 -0
  211. package/workflows/impeccable/critique.md +261 -0
  212. package/workflows/impeccable/delight.md +302 -0
  213. package/workflows/impeccable/distill.md +111 -0
  214. package/workflows/impeccable/document.md +439 -0
  215. package/workflows/impeccable/extract.md +69 -0
  216. package/workflows/impeccable/harden.md +347 -0
  217. package/workflows/impeccable/heuristics-scoring.md +234 -0
  218. package/workflows/impeccable/interaction-design.md +195 -0
  219. package/workflows/impeccable/layout.md +141 -0
  220. package/workflows/impeccable/live.md +622 -0
  221. package/workflows/impeccable/motion-design.md +109 -0
  222. package/workflows/impeccable/onboard.md +234 -0
  223. package/workflows/impeccable/optimize.md +258 -0
  224. package/workflows/impeccable/overdrive.md +130 -0
  225. package/workflows/impeccable/personas.md +179 -0
  226. package/workflows/impeccable/polish.md +242 -0
  227. package/workflows/impeccable/product.md +62 -0
  228. package/workflows/impeccable/quieter.md +99 -0
  229. package/workflows/impeccable/responsive-design.md +114 -0
  230. package/workflows/impeccable/shape.md +165 -0
  231. package/workflows/impeccable/spatial-design.md +100 -0
  232. package/workflows/impeccable/teach.md +168 -0
  233. package/workflows/impeccable/typeset.md +124 -0
  234. package/workflows/impeccable/typography.md +159 -0
  235. package/workflows/impeccable/ux-writing.md +107 -0
  236. package/workflows/impeccable.md +164 -0
  237. package/workflows/maestro.md +7 -3
  238. package/workflows/skill-authoring.md +265 -0
  239. package/workflows/specs-add.md +3 -2
  240. package/workflows/specs-load.md +2 -1
  241. package/workflows/specs-setup.md +21 -1
@@ -1,517 +1,229 @@
1
- ---
2
- name: quality-test
3
- description: Conversational UAT with auto-diagnosis and gap closure
4
- argument-hint: "<phase> [-y] [--smoke] [--auto-fix] [--session ID]"
5
- allowed-tools: spawn_agents_on_csv, Read, Write, Edit, Bash, Glob, Grep, AskUserQuestion
6
- ---
7
-
8
- <purpose>
9
- Conversational UAT: present expected behavior one test at a time, user confirms or describes issues. Severity inferred from natural language (never asked). Session persists in `uat.md` across context resets. Failed tests trigger CSV-parallel diagnosis via `spawn_agents_on_csv` and optional gap-fix closure.
10
-
11
- **Philosophy**: Show expected, ask if reality matches.
12
-
13
- ```
14
- +---------------------------------------------------------------------------+
15
- | UAT CSV DIAGNOSIS PIPELINE |
16
- +---------------------------------------------------------------------------+
17
- | |
18
- | Phase 1: Setup & Scenario Design |
19
- | +-- Resolve target (phase / scratch) |
20
- | +-- Check active sessions (resume or new) |
21
- | +-- Smoke tests (if --smoke) |
22
- | +-- Load verification context + quality artifacts |
23
- | +-- Design test scenarios from user-observable outcomes |
24
- | +-- Create uat.md with all tests pending |
25
- | |
26
- | Phase 2: Interactive Testing (one at a time) |
27
- | +-- Present test: show expected behavior |
28
- | +-- User responds: pass / skip / describe issue |
29
- | +-- Severity inferred (never asked) |
30
- | +-- Issues auto-created in issues.jsonl |
31
- | +-- Batched writes to uat.md |
32
- | |
33
- | Phase 3: Diagnosis (if issues found) |
34
- | +-- Cluster gaps by component/module |
35
- | +-- Build diagnosis.csv from gap clusters |
36
- | +-- Diagnose in parallel via spawn_agents_on_csv |
37
- | +-- Each agent: find root cause, fix direction, affected files |
38
- | +-- Merge results into uat.md gaps |
39
- | |
40
- | Phase 4: Gap Closure & Report |
41
- | +-- If --auto-fix: plan --gaps -> execute -> re-verify (max 2) |
42
- | +-- Otherwise: present options (auto-fix / debug / plan / manual) |
43
- | +-- Issue lifecycle sync throughout |
44
- | +-- Report with pass/fail counts and next steps |
45
- | |
46
- +---------------------------------------------------------------------------+
47
- ```
48
- </purpose>
49
-
50
- <context>
51
- ```bash
52
- $quality-test "3" # test phase 3
53
- $quality-test "3 --smoke" # smoke tests first, then UAT
54
- $quality-test "3 --auto-fix" # auto-trigger gap-fix loop on failures
55
- $quality-test "-y 3" # implies --auto-fix, skip gap closure prompt
56
- $quality-test "--session 04-comments" # resume specific session
57
- ```
58
-
59
- **Flags**:
60
- - `<phase>`: Phase number or scratch task ID
61
- - `--smoke`: Run cold-start smoke tests before UAT
62
- - `--auto-fix`: Auto-trigger gap-fix loop (plan --gaps -> execute -> re-verify) on failures
63
- - `--session ID`: Resume a specific UAT session
64
-
65
- `-y` implies `--auto-fix`. UAT itself remains interactive (present expected → user confirms). `-y` only automates the gap closure loop.
66
-
67
- **Output**:
68
- - `{target_dir}/uat.md` — session file (persistent)
69
- - `{target_dir}/.tests/test-plan.json` — scenario definitions
70
- - `{target_dir}/.tests/test-results.json` — pass/fail results
71
- - `{target_dir}/.tests/coverage-report.json` — requirement coverage
72
- - `.tests/.csv-session/diagnosis.csv` + `diagnosis-results.csv` — diagnosis artifacts
73
- </context>
74
-
75
- <csv_schema>
76
-
77
- ### diagnosis.csv (Gap Diagnosis Phase)
78
-
79
- ```csv
80
- id,test_id,cluster,test_name,expected,reported,severity,target_files,issue_id,source_context,root_cause,fix_direction,affected_files,evidence,error
81
- "DX-001","T-003","auth","Login validation","Valid login returns dashboard","Clicking login does nothing, no error","major","src/auth/login.ts;src/routes/auth.ts","ISS-20260503-001","login.ts calls authService.verify, auth.ts exports POST /login","","","","",""
82
- "DX-002","T-005","events","Event cleanup on logout","Events unsubscribed after logout","Memory leak warning in console after logout","blocker","src/events/manager.ts","ISS-20260503-002","manager.ts has subscribe() but no unsubscribe in logout flow","","","","",""
83
- ```
84
-
85
- **Columns**:
86
-
87
- | Column | Phase | Description |
88
- |--------|-------|-------------|
89
- | `id` | Input | Diagnosis ID (DX-NNN) |
90
- | `test_id` | Input | Reference to T-NNN test |
91
- | `cluster` | Input | Gap cluster name (component/area) |
92
- | `test_name` | Input | Human-readable test name |
93
- | `expected` | Input | Expected behavior from test scenario |
94
- | `reported` | Input | User's issue description (verbatim) |
95
- | `severity` | Input | Inferred severity (blocker/major/minor/cosmetic) |
96
- | `target_files` | Input | Semicolon-separated source files to investigate |
97
- | `issue_id` | Input | Back-reference to issues.jsonl entry |
98
- | `source_context` | Input | Relevant code context (imports, exports, call chains) |
99
- | `root_cause` | Output | Diagnosed root cause |
100
- | `fix_direction` | Output | Suggested fix approach |
101
- | `affected_files` | Output | Semicolon-separated files that need changes |
102
- | `evidence` | Output | file:line references supporting diagnosis |
103
- | `error` | Output | Agent error if diagnosis failed |
104
-
105
- ### Session Structure
106
-
107
- ```
108
- {target_dir}/.tests/.csv-session/
109
- +-- diagnosis.csv (diagnosis input)
110
- +-- diagnosis-results.csv (diagnosis output)
111
- ```
112
- </csv_schema>
113
-
114
- <invariants>
115
- 1. **One test at a time** — never batch-present tests
116
- 2. **Never ask severity** — always infer from natural language
117
- 3. **Session persistence**: `uat.md` survives context resets; resume from any point
118
- 4. **Batched writes** — minimize file I/O (on issue, every 5 passes, completion)
119
- 5. **Gap-fix loop max 2 iterations** — prevent infinite loops
120
- 6. **CSV parallel diagnosis** — spawn_agents_on_csv for gap clusters, not sequential
121
- 7. **Auto-create issues** — every failed test creates entry in `.workflow/issues/issues.jsonl`
122
- 8. **Issue lifecycle sync** — track issues through registered → planning → executing → completed/failed
123
- </invariants>
124
-
125
- <execution>
126
-
127
- ### Step 1: Resolve Target
128
-
129
- 1. Parse `$ARGUMENTS` for phase number, scratch task ID, or flags
130
- 2. **Phase mode**: resolve `PHASE_DIR` via artifact registry in `state.json` (`type='execute'`, matching phase)
131
- 3. **Scratch mode**: set `SCRATCH_DIR = .workflow/scratch/{id}/`
132
- 4. Validate that the target exists and has `verification.json`; if missing: **E002**
133
-
134
- ### Step 2: Check Active Sessions
135
-
136
- Scan `.workflow/scratch` for existing `uat.md` files with `status: testing` in frontmatter.
137
-
138
- - If active sessions exist and no target specified: display session table, ask user to resume or start new:
139
- ```
140
- ## Active UAT Sessions
141
- | # | Target | Status | Current Test | Progress |
142
- |---|--------|--------|--------------|----------|
143
- | 1 | 04-comments | testing | 3. Reply to Comment | 2/6 |
144
- Reply with a number to resume, or provide a phase/task to start new.
145
- ```
146
- - If `--session ID` specified: resume that session directly (skip to Step 9)
147
- - If session exists for target: offer resume or restart
148
-
149
- ### Step 3: Smoke Tests (if --smoke)
150
-
151
- Skip if `--smoke` not set.
152
-
153
- | Smoke Test | Check | Method |
154
- |------------|-------|--------|
155
- | App starts | Process runs without crash | Run start command, check exit code |
156
- | Routes respond | Key endpoints return non-error | curl/fetch main routes |
157
- | Build clean | No build errors | Build command succeeds |
158
- | Dependencies | No missing deps | Install check |
159
-
160
- Record in `uat.md` under `## Smoke Tests`. If any fails: **E003** — abort, suggest `$quality-debug`.
161
-
162
- ### Step 4: Load Verification Context
163
-
164
- Read from target directory: `verification.json`, `validation.json`, `index.json`, `plan.json`, `.summaries/TASK-*.md`. Build testable list from user-observable outcomes.
165
-
166
- ### Step 4.5: Load Test Tools (Knowhow Discovery)
167
-
168
- Load registered test tools to supplement verification-based scenarios:
169
-
170
- ```bash
171
- maestro spec load --category test --keyword <feature>
172
- ```
173
-
174
- If tools are found, extract their steps as additional test scenarios marked `source: "tool"`. Tool steps map to UAT scenarios: each numbered step becomes a test with its assertion as `expected` behavior. This enables tools registered via `/maestro-tools-register` to drive UAT verification.
175
-
176
- ### Step 4.6: Load Quality Context (Cross-Artifact Integration)
177
-
178
- Query `state.json.artifacts[]` for all artifacts matching current phase and milestone:
179
-
180
- **Review findings integration**:
181
- - For `type: "review"` artifacts: read `review.json`, extract critical/high findings
182
- - Generate additional test scenarios marked `source: "review_finding"`
183
- - If review verdict is "BLOCK" and review-finding tests fail → auto-enter gap-fix loop
184
-
185
- **Debug root cause integration**:
186
- - For `type: "debug"` artifacts: read `understanding.md`, extract confirmed root causes
187
- - Generate regression test scenarios marked `source: "debug_root_cause"`
188
-
189
- ### Step 5: Design Test Scenarios
190
-
191
- Create scenarios from testables:
192
- - `id`: T-001, T-002, ...
193
- - `name`: Brief test name
194
- - `category`: "e2e" | "integration" | "unit"
195
- - `expected`: Specific observable behavior
196
- - `requirement_ref`: Which success criterion this covers
197
- - `source`: "verification" | "tool" | "review_finding" | "debug_root_cause"
198
-
199
- Write `{target_dir}/.tests/test-plan.json`:
200
- ```json
201
- {
202
- "target": "{phase or scratch ID}",
203
- "generated_at": "{ISO}",
204
- "tests": [...],
205
- "coverage": {
206
- "requirements_mapped": ["SC-001"],
207
- "requirements_unmapped": ["SC-003"]
208
- }
209
- }
210
- ```
211
-
212
- Focus on USER-OBSERVABLE outcomes. Skip internal/non-observable items.
213
-
214
- ### Step 6: Create UAT File
215
-
216
- Archive previous `uat.md` to `.history/` if exists.
217
-
218
- Write `{target_dir}/uat.md`:
219
- ```markdown
220
- ---
221
- status: testing
222
- target: {phase slug or scratch ID}
223
- source: [list of summary files]
224
- started: {ISO}
225
- updated: {ISO}
226
- ---
227
-
228
- ## Current Test
229
- number: 1
230
- name: {first test name}
231
- expected: |
232
- {what user should observe}
233
- awaiting: user response
234
-
235
- ## Smoke Tests
236
- {results if ran, otherwise omitted}
237
-
238
- ## Tests
239
- ### 1. {Test Name}
240
- expected: {observable behavior}
241
- result: [pending]
242
-
243
- ## Summary
244
- total: {N} passed: 0 issues: 0 pending: {N} skipped: 0
245
-
246
- ## Gaps
247
- [none yet]
248
- ```
249
-
250
- ### Step 7: Present Test (Interactive Loop)
251
-
252
- Present one test at a time:
253
- ```
254
- ------------------------------------------------------------
255
- TEST {number}/{total}: {name}
256
- ------------------------------------------------------------
257
-
258
- Expected behavior:
259
- {expected}
260
-
261
- ------------------------------------------------------------
262
- > Type "pass" or describe what's wrong
263
- ------------------------------------------------------------
264
- ```
265
-
266
- Wait for user response (plain text).
267
-
268
- ### Step 8: Process Response
269
-
270
- | Response | Action |
271
- |----------|--------|
272
- | empty, "yes", "y", "ok", "pass", "next" | Mark as pass |
273
- | "skip", "can't test", "n/a" | Mark as skipped |
274
- | Anything else | Log as issue, infer severity |
275
-
276
- **Severity inference** (never ask):
277
-
278
- | User says | Infer |
279
- |-----------|-------|
280
- | "crashes", "error", "exception", "fails completely", "can't use" | blocker |
281
- | "doesn't work", "nothing happens", "wrong behavior", "broken" | major |
282
- | "works but...", "slow", "weird", "minor issue", "inconsistent" | minor |
283
- | "color", "spacing", "alignment", "looks off", "typo" | cosmetic |
284
-
285
- Default: **major** if unclear.
286
-
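The two tables above can be read as a small classifier. The sketch below is one possible reading, not the skill's actual implementation: the keyword lists are copied from the tables, while case-insensitive substring matching and scanning from `blocker` down to `cosmetic` are assumptions, and the names `process_response` and `infer_severity` are hypothetical.

```python
# Keyword lists copied from the tables above. Substring matching and the
# blocker-first scan order are assumptions of this sketch.
PASS_WORDS = {"", "yes", "y", "ok", "pass", "next"}
SKIP_WORDS = {"skip", "can't test", "n/a"}
SEVERITY_KEYWORDS = [
    ("blocker", ["crashes", "error", "exception", "fails completely", "can't use"]),
    ("major", ["doesn't work", "nothing happens", "wrong behavior", "broken"]),
    ("minor", ["works but", "slow", "weird", "minor issue", "inconsistent"]),
    ("cosmetic", ["color", "spacing", "alignment", "looks off", "typo"]),
]


def infer_severity(text: str) -> str:
    """Return the first severity whose keyword appears in the text, else 'major'."""
    lowered = text.lower()
    for severity, keywords in SEVERITY_KEYWORDS:
        if any(keyword in lowered for keyword in keywords):
            return severity
    return "major"  # documented default when the wording is unclear


def process_response(text: str) -> dict:
    """Classify a plain-text reply as pass, skipped, or an issue with severity."""
    normalized = text.strip().lower()
    if normalized in PASS_WORDS:
        return {"result": "pass"}
    if normalized in SKIP_WORDS:
        return {"result": "skipped"}
    return {"result": "issue", "severity": infer_severity(text), "reported": text}
```

Because matching is by substring, a reply such as "no errors, all good" would still be logged as a blocker issue; a real implementation would likely need a more careful reading of the reply than this keyword scan.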
287
- **On issue**: auto-create issue in `.workflow/issues/issues.jsonl`:
288
- ```json
289
- {
290
- "id": "ISS-{YYYYMMDD}-{NNN}",
291
- "title": "UAT: {test.name} - {response truncated 100 chars}",
292
- "status": "registered",
293
- "priority": "{from severity}",
294
- "severity": "{inferred}",
295
- "source": "uat",
296
- "phase_ref": "{phase}",
297
- "gap_ref": "{test.id}",
298
- "description": "Expected: {expected}. Reported: {verbatim}",
299
- "tags": ["uat"]
300
- }
301
- ```
302
-
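As an illustration of the record above, the following sketch fills the template and appends it as one line of JSON. The source only says the priority comes "from severity", so the `PRIORITY_FROM_SEVERITY` mapping is a hypothetical placeholder, as are the function names and the `seq` counter argument.

```python
import json
from datetime import date

# Assumption: the source does not define the severity-to-priority mapping;
# these values are placeholders for illustration only.
PRIORITY_FROM_SEVERITY = {
    "blocker": "critical",
    "major": "high",
    "minor": "medium",
    "cosmetic": "low",
}


def build_issue(test: dict, reported: str, severity: str, phase: str,
                seq: int, today: date) -> dict:
    """Fill the issue template for one failed test."""
    return {
        "id": f"ISS-{today:%Y%m%d}-{seq:03d}",
        # The title carries at most 100 characters of the user's reply.
        "title": f"UAT: {test['name']} - {reported[:100]}",
        "status": "registered",
        "priority": PRIORITY_FROM_SEVERITY[severity],
        "severity": severity,
        "source": "uat",
        "phase_ref": phase,
        "gap_ref": test["id"],
        # The description keeps the reply verbatim.
        "description": f"Expected: {test['expected']}. Reported: {reported}",
        "tags": ["uat"],
    }


def append_issue(path: str, issue: dict) -> None:
    """Append the issue as a single JSON Lines record."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(issue, ensure_ascii=False) + "\n")
```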
303
- Back-reference: set `gap.issue_id = issue_id` in uat.md gap entry.
304
-
305
- **Batched writes**: write to file on issue, every 5 passes, or completion.
306
-
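The batching rule can be expressed as a small counter. This is a minimal sketch assuming results arrive one at a time as the strings `"pass"`, `"skipped"`, or `"issue"`; the `WriteBatcher` class and its method names are hypothetical.

```python
class WriteBatcher:
    """Decide when buffered UAT results should be written to uat.md."""

    def __init__(self, pass_interval: int = 5):
        self.pass_interval = pass_interval
        self.passes_since_flush = 0

    def should_flush(self, result: str, is_last: bool = False) -> bool:
        """Return True on an issue, on every fifth pass, or on completion."""
        if result == "pass":
            self.passes_since_flush += 1
        flush = (
            result == "issue"
            or is_last
            or self.passes_since_flush >= self.pass_interval
        )
        if flush:
            # Any write persists everything buffered so far, so restart the count.
            self.passes_since_flush = 0
        return flush
```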
307
- If more tests → update Current Test, loop to Step 7.
308
- If done → go to Step 10.
309
-
310
- ### Step 9: Resume From File
311
-
312
- Read `uat.md`, find first `result: [pending]` test, announce progress, continue from there (go to Step 7).
313
-
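Finding the resume point could look like the sketch below. It assumes the `## Tests` layout from Step 6, where each test starts with a `### N. Name` heading followed by a `result:` line; the function name is hypothetical.

```python
import re

# Matches test headings such as "### 3. Reply to Comment".
TEST_HEADING = re.compile(r"^### (\d+)\. (.+)$")


def first_pending_test(uat_text: str):
    """Return (number, name) of the first test still marked [pending], or None."""
    current = None
    for line in uat_text.splitlines():
        stripped = line.strip()
        heading = TEST_HEADING.match(stripped)
        if heading:
            current = (int(heading.group(1)), heading.group(2))
        elif current and stripped == "result: [pending]":
            return current
    return None
```

A `None` result means every test already has a recorded result, so the session can go straight to completion.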
314
- ### Step 10: Complete Session
315
-
316
- 1. Update `uat.md` frontmatter: `status → "complete"`, update timestamp
317
- 2. Archive previous result artifacts to `.history/`
318
- 3. Write `.tests/test-results.json`:
319
- ```json
320
- { "target": "...", "completed_at": "...", "results": [...], "summary": { "total": N, "passed": N, "issues": N, "skipped": N } }
321
- ```
322
- 4. Write `.tests/coverage-report.json`:
323
- ```json
324
- { "target": "...", "requirements_covered": [...], "requirements_uncovered": [...], "coverage_percentage": 66.7 }
325
- ```
326
- 5. Update `index.json` with UAT results
327
- 6. **Register artifact** in `state.json.artifacts[]`:
328
- ```json
329
- { "id": "TST-NNN", "type": "test", "milestone": "current", "phase": "target_phase", "scope": "phase",
330
- "path": "scratch/{YYYYMMDD}-test-P{N}-{slug}", "status": "completed|failed", "depends_on": "exec_art.id" }
331
- ```
332
- 7. If no issues → go to Step 13
333
- 8. If issues found → go to Step 11
334
-
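The `coverage_percentage` field in item 4 is presumably the share of requirements that have at least one test. A sketch under that assumption, with a hypothetical function name and rounding to one decimal place to match the `66.7` in the example:

```python
def coverage_report(target: str, requirements: list, covered: list) -> dict:
    """Build the coverage-report.json payload from requirement IDs."""
    covered_set = set(covered)
    covered_list = [req for req in requirements if req in covered_set]
    uncovered_list = [req for req in requirements if req not in covered_set]
    # Guard against an empty requirement list to avoid dividing by zero.
    percentage = (
        round(100 * len(covered_list) / len(requirements), 1) if requirements else 0.0
    )
    return {
        "target": target,
        "requirements_covered": covered_list,
        "requirements_uncovered": uncovered_list,
        "coverage_percentage": percentage,
    }
```

With three requirements and two of them covered, the percentage comes out as 66.7.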
335
- ### Step 11: Auto-Diagnose via CSV Parallel
336
-
337
- **Cluster gaps and diagnose in parallel via `spawn_agents_on_csv`.**
338
-
339
- #### 11a. Cluster Gaps
340
-
341
- Group issues by affected component/area:
342
- - Same file/module → one cluster
343
- - Same feature/flow → one cluster
344
- - Unrelated → separate clusters
345
-
346
- #### 11b. Build diagnosis.csv
347
-
348
- ```
349
- mkdir -p {target_dir}/.tests/.csv-session
350
-
351
- For each gap in uat.md:
352
- Resolve target_files from gap context (test expected behavior → source files)
353
- Gather source_context (imports, exports, call chains from target files)
354
- Create one diagnosis.csv row with: id, test_id, cluster, test_name, expected, reported, severity, target_files, issue_id, source_context
355
- ```
356
-
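The row-building step above might be sketched as follows, using the column order from the `diagnosis.csv` schema. The shape of the `gaps` input (a list of dicts with `target_files` as a list) and the function name are assumptions; output columns are left empty for the diagnosis agents to fill.

```python
import csv
import io

INPUT_COLUMNS = [
    "id", "test_id", "cluster", "test_name", "expected", "reported",
    "severity", "target_files", "issue_id", "source_context",
]
OUTPUT_COLUMNS = ["root_cause", "fix_direction", "affected_files", "evidence", "error"]
ALL_COLUMNS = INPUT_COLUMNS + OUTPUT_COLUMNS


def build_diagnosis_csv(gaps: list) -> str:
    """Render one fully quoted CSV row per gap, with empty output columns."""
    buffer = io.StringIO()
    buffer.write(",".join(ALL_COLUMNS) + "\n")  # unquoted header, as in the schema example
    writer = csv.DictWriter(
        buffer, fieldnames=ALL_COLUMNS, quoting=csv.QUOTE_ALL, lineterminator="\n"
    )
    for index, gap in enumerate(gaps, start=1):
        row = {column: gap.get(column, "") for column in ALL_COLUMNS}
        row["id"] = f"DX-{index:03d}"
        # File lists are stored semicolon-separated in a single cell.
        row["target_files"] = ";".join(gap["target_files"])
        writer.writerow(row)
    return buffer.getvalue()
```

Quoting every field keeps commas inside the verbatim `reported` text from splitting a cell.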
357
- #### 11c. Parallel Diagnosis via spawn_agents_on_csv
358
-
359
- ```javascript
360
- spawn_agents_on_csv({
361
- csv_path: `${targetDir}/.tests/.csv-session/diagnosis.csv`,
362
- id_column: "id",
363
- instruction: `
364
- You are a UAT failure diagnostician. Investigate ONE gap cluster.
365
-
366
- ## Task
367
- - Read all target_files to understand the relevant code
368
- - Analyze: why does the expected behavior not match what user reported?
369
- - Find the root cause (not the symptom)
370
- - Suggest a fix direction (what needs to change, not exact code)
371
- - List all files that would need modification
372
-
373
- ## Output
374
- - root_cause: Concise explanation of why the issue occurs
375
- - fix_direction: Suggested approach to fix (e.g., "Add null check before accessing user.email")
376
- - affected_files: Semicolon-separated list of files needing changes
377
- - evidence: file:line references supporting your diagnosis
378
-
379
- ## Rules
380
- - Do NOT modify any files — diagnosis only
381
- - Focus on root cause, not symptoms
382
- - Reference issue_id in your findings for traceability
383
- - If multiple gaps in same cluster share a root cause, note the shared cause
384
- `,
385
- max_concurrency: 5,
386
- max_runtime_seconds: 1200,
387
- output_csv_path: `${targetDir}/.tests/.csv-session/diagnosis-results.csv`,
388
- output_schema: { id, root_cause, fix_direction, affected_files, evidence, error }
389
- })
390
- ```
391
-
392
- #### 11d. Merge Results
393
-
394
- Update `uat.md` gaps with diagnosis:
395
- ```yaml
396
- - test: {N}
397
- truth: "..."
398
- status: failed
399
- reason: "..."
400
- severity: {inferred}
401
- issue_id: ISS-YYYYMMDD-NNN
402
- root_cause: "{diagnosed cause}"
403
- fix_direction: "{suggested approach}"
404
- affected_files: ["{file1}", "{file2}"]
405
- ```
406
-
407
- ### Step 12: Gap Closure Decision
408
-
409
- **If `--auto-fix` or `-y`**: execute gap-fix loop directly.
410
-
411
- **Otherwise**: present diagnosis summary and offer options:
412
- ```
413
- ### Diagnosis Complete
414
-
415
- | Gap | Severity | Root Cause | Fix Direction |
416
- |-----|----------|------------|---------------|
417
- | T-3 | major | Missing null check | Add guard clause |
418
- | T-5 | blocker | Event not cleaned | Add cleanup logic |
419
-
420
- Options:
421
- 1. Auto-fix — Plan and execute fixes, then re-verify
422
- 2. Debug deep — $quality-debug per issue
423
- 3. Plan fixes — $maestro-plan "--gaps"
424
- 4. Manual fix — Address issues yourself
425
- ```
426
-
427
- | Choice | Action |
428
- |--------|--------|
429
- | 1 / "auto-fix" | Execute gap-fix loop |
430
- | 2 / "debug" | Suggest `$quality-debug "--from-uat {phase}"` |
431
- | 3 / "plan" | Suggest `$maestro-plan "{phase} --gaps"` |
432
- | 4 / "manual" | Done, report results |
433
-
434
- **Gap-fix closure loop** (max 2 iterations):
435
- 1. `$maestro-plan "{phase} --gaps"` — generate fix tasks from gaps
436
- 2. `$maestro-execute "{phase}"` — execute fix tasks
437
- 3. `$maestro-verify "{phase}"` — re-verify
438
-
439
- **Issue lifecycle sync during loop:**
440
- - Before plan: `registered` → `planning`
441
- - Before execute: `planning` → `executing`
442
- - After re-verify: resolved gaps → `completed` (resolution: "auto-fixed via gap-fix loop"), unresolved → `failed`
443
-
444
- If re-verify passes: update uat.md gaps as resolved, report success.
445
- If gaps remain after 2 iterations: report remaining, suggest manual intervention.
446
-
447
- ### Step 12.5: UAT Confidence Scoring
448
-
449
- Dimensions (4): scenario_coverage, diagnostic_depth, observation_quality, closure_completeness. Factors (weights): requirements_mapped(.30), observation_specificity(.25), user_validation(.20), diagnostic_depth(.15), consistency(.10). Append confidence summary to `uat.md`.
450
-
451
- **Readiness gate** (before final report): Block if scenario_coverage < 40% or any blocker-severity gap without diagnosis.
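The factor weights listed above sum to 1.00, which suggests a weighted average. The source does not state how the factors are combined, so the sketch below is only one plausible reading; `WEIGHTS` and `confidence_score` are hypothetical names, and each factor is assumed to be scored in the range 0.0 to 1.0.

```python
# Weights copied from the factor list above; the combination rule is an assumption.
WEIGHTS = {
    "requirements_mapped": 0.30,
    "observation_specificity": 0.25,
    "user_validation": 0.20,
    "diagnostic_depth": 0.15,
    "consistency": 0.10,
}

def confidence_score(factors):
    """Weighted sum of factor scores, each assumed to lie in [0.0, 1.0]."""
    return sum(weight * factors[name] for name, weight in WEIGHTS.items())
```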
452
-
453
- ### Step 13: Report
454
-
455
- ```
456
- === UAT RESULTS ===
457
- Target: {target}
458
-
459
- Smoke Tests: {smoke_count} run, {smoke_pass} passed (if ran)
460
- UAT Tests: {total} total
461
- Passed: {passed}
462
- Issues: {issues} ({blocker_count} blockers, {major_count} major)
463
- Skipped: {skipped}
464
-
465
- Diagnosis: {diagnosed_count}/{issues} gaps diagnosed
466
- Auto-fix: {fixed_count} gaps resolved (if ran)
467
-
468
- Files:
469
- {target_dir}/uat.md
470
- {target_dir}/.tests/test-results.json
471
- {target_dir}/.tests/coverage-report.json
472
- {target_dir}/.tests/.csv-session/diagnosis-results.csv (if diagnosed)
473
- ```
474
-
475
- **Next-step routing:**
476
-
477
- | Result | Next Step |
478
- |--------|-----------|
479
- | All passed, no gaps | `$maestro-milestone-audit` |
480
- | Auto-fix ran and succeeded | `$maestro-verify "{phase}"` |
481
- | Auto-fix ran but gaps remain | `$quality-debug "--from-uat {phase}"` |
482
- | Issues found, manual fix needed | `$quality-debug "--from-uat {phase}"` |
483
- | Coverage below threshold | `$quality-auto-test "{phase}"` |
484
- | Need integration tests | `$quality-auto-test "{phase}"` |
485
-
486
- </execution>
487
-
488
- <error_codes>
489
- | Code | Severity | Condition | Recovery |
490
- |------|----------|-----------|----------|
491
- | E001 | error | Phase or task target required (no active sessions) | Prompt user for phase number |
492
- | E002 | error | Phase not verified (no verification.json) | Suggest `$maestro-verify` |
493
- | E003 | error | Smoke test failed (app won't start) | Suggest `$quality-debug` |
494
- | W001 | warning | Test scenarios failed | Auto-diagnose, suggest fix options |
495
- | W002 | warning | Coverage below threshold | Suggest `$quality-auto-test` |
496
- </error_codes>
497
-
498
- <success_criteria>
499
- - [ ] Target resolved and verification context loaded
500
- - [ ] Quality artifacts loaded (review findings → extra tests, debug root causes → regression tests)
501
- - [ ] Test scenarios designed from user-observable outcomes
502
- - [ ] UAT file created with session persistence
503
- - [ ] Tests presented one at a time, severity inferred (never asked)
504
- - [ ] Issues auto-created in issues.jsonl for all failures
505
- - [ ] Batched writes: on issue, every 5 passes, or completion
506
- - [ ] test-results.json and coverage-report.json written
507
- - [ ] index.json uat fields updated
508
- - [ ] Artifact registered in state.json
509
- - [ ] UAT confidence scored with 4-dimension factor model
510
- - [ ] Readiness gate checked before final report
511
- - [ ] Confidence summary appended to uat.md
512
- - [ ] If issues: diagnosis.csv built, spawn_agents_on_csv executed per gap cluster
513
- - [ ] Gaps updated with root_cause, fix_direction, affected_files
514
- - [ ] Gap-fix loop triggered if --auto-fix (max 2 iterations)
515
- - [ ] Issue lifecycle synced through gap-fix loop
516
- - [ ] Next step routed based on result
517
- </success_criteria>
1
+ ---
2
+ name: quality-test
3
+ description: Conversational UAT with auto-diagnosis and gap closure
4
+ argument-hint: "<phase> [-y] [--smoke] [--auto-fix] [--session ID]"
5
+ allowed-tools: spawn_agents_on_csv, Read, Write, Edit, Bash, Glob, Grep, AskUserQuestion
6
+ ---
7
+
8
+ <purpose>
9
+ Conversational UAT: present expected behavior one test at a time, user confirms or describes issues. Severity inferred from natural language (never asked). Session persists in `uat.md` across context resets. Failed tests trigger CSV-parallel diagnosis via `spawn_agents_on_csv` and optional gap-fix closure (max 2 iterations).
10
+
11
+ **Philosophy**: Show expected, ask if reality matches.
12
+ </purpose>
13
+
14
+ <context>
15
+ $ARGUMENTS -- phase number and optional flags.
16
+
17
+ **Flags**:
18
+ - `<phase>`: Phase number or scratch task ID
19
+ - `--smoke`: Run cold-start smoke tests before UAT
20
+ - `--auto-fix`: Auto-trigger gap-fix loop (plan --gaps -> execute -> re-verify)
21
+ - `--session ID`: Resume specific UAT session
22
+ - `-y`: Implies --auto-fix; UAT itself stays interactive
23
+
24
+ **Output**:
25
+ - `{target_dir}/uat.md` -- session file (persistent)
26
+ - `{target_dir}/.tests/test-plan.json` -- scenario definitions
27
+ - `{target_dir}/.tests/test-results.json` -- pass/fail results
28
+ - `{target_dir}/.tests/coverage-report.json` -- requirement coverage
29
+ - `.tests/.csv-session/diagnosis.csv` + `diagnosis-results.csv`
30
+ </context>
31
+
32
+ <csv_schema>
33
+
34
+ ### diagnosis.csv (Gap Diagnosis Phase)
35
+
36
+ ```csv
37
+ id,test_id,cluster,test_name,expected,reported,severity,target_files,issue_id,source_context,root_cause,fix_direction,affected_files,evidence,error
38
+ "DX-001","T-003","auth","Login validation","Valid login returns dashboard","Clicking login does nothing","major","src/auth/login.ts;src/routes/auth.ts","ISS-20260503-001","login.ts calls authService.verify","","","","",""
39
+ ```
40
+
41
+ Input: id, test_id, cluster, test_name, expected, reported, severity, target_files, issue_id, source_context.
42
+ Output: root_cause, fix_direction, affected_files, evidence, error.
43
+ </csv_schema>
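The input/output column split in the schema above can be illustrated with a minimal Python sketch. It assumes the header order shown in the sample row; `INPUT_COLUMNS`, `OUTPUT_COLUMNS`, and `split_row` are hypothetical names, not part of maestro-flow.

```python
import csv
import io

# Column names copied from the diagnosis.csv header above.
INPUT_COLUMNS = ["id", "test_id", "cluster", "test_name", "expected", "reported",
                 "severity", "target_files", "issue_id", "source_context"]
OUTPUT_COLUMNS = ["root_cause", "fix_direction", "affected_files", "evidence", "error"]

def split_row(row):
    """Split one parsed row into its input part and its (initially empty) output part."""
    inputs = {name: row[name] for name in INPUT_COLUMNS}
    outputs = {name: row[name] for name in OUTPUT_COLUMNS}
    # target_files is semicolon-separated in the sample row.
    inputs["target_files"] = [p for p in inputs["target_files"].split(";") if p]
    return inputs, outputs

# The sample row from the schema, rebuilt as an in-memory CSV document.
SAMPLE = (
    ",".join(INPUT_COLUMNS + OUTPUT_COLUMNS) + "\n"
    '"DX-001","T-003","auth","Login validation","Valid login returns dashboard",'
    '"Clicking login does nothing","major","src/auth/login.ts;src/routes/auth.ts",'
    '"ISS-20260503-001","login.ts calls authService.verify","","","","",""\n'
)

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
inputs, outputs = split_row(rows[0])
```

Before diagnosis runs, every output column is an empty string, so a row with a non-empty `root_cause` can be treated as already diagnosed.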
44
+
45
+ <invariants>
46
+ 1. **One test at a time** -- never batch-present tests
47
+ 2. **Never ask severity** -- always infer from natural language
48
+ 3. **Session persistence** -- uat.md survives context resets, resume from any point
49
+ 4. **Batched writes** -- on issue, every 5 passes, or completion
50
+ 5. **Gap-fix loop max 2 iterations** -- prevent infinite loops
51
+ 6. **CSV parallel diagnosis** -- spawn_agents_on_csv for gap clusters, not sequential
52
+ 7. **Auto-create issues** -- every failed test -> issues.jsonl entry
53
+ 8. **Issue lifecycle sync** -- registered -> planning -> executing -> completed/failed
54
+ </invariants>
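Invariant 4 (batched writes) could be sketched as a small buffer that flushes on an issue, after every fifth pass, or on completion. This is an illustration only: `BatchedWriter` and its `flush` callback are hypothetical, and whether a skipped test counts toward the batch is not stated in the source (here it does not).

```python
class BatchedWriter:
    """Buffers UAT results and flushes on issue, every 5 passes, or completion."""

    def __init__(self, flush, pass_batch=5):
        self.flush = flush              # hypothetical callback that persists uat.md
        self.pass_batch = pass_batch
        self.pending = []
        self.passes_since_flush = 0

    def record(self, test_id, result):
        self.pending.append((test_id, result))
        if result == "issue":
            self._flush()               # issues are written immediately
        elif result == "pass":
            self.passes_since_flush += 1
            if self.passes_since_flush >= self.pass_batch:
                self._flush()

    def complete(self):
        self._flush()                   # final write at session end

    def _flush(self):
        if self.pending:
            self.flush(list(self.pending))
            self.pending.clear()
        self.passes_since_flush = 0
```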
55
+
56
+ <state_machine>
57
+
58
+ <states>
59
+ S_RESOLVE -- Resolve target (phase/scratch), check for active sessions PERSIST: --
60
+ S_SMOKE -- Smoke tests (--smoke, skippable) PERSIST: uat.md smoke section
61
+ S_DESIGN -- Design test scenarios (verification context + quality artifacts) PERSIST: test-plan.json
62
+ S_CREATE_UAT -- Create uat.md PERSIST: uat.md
63
+ S_PRESENT -- Present tests one at a time, collect user feedback PERSIST: uat.md (batched)
64
+ S_COMPLETE -- Complete session, write result files PERSIST: test-results.json + coverage-report.json
65
+ S_DIAGNOSE -- Diagnose gap clusters in parallel via CSV PERSIST: diagnosis-results.csv
66
+ S_GAP_CLOSE -- Gap-fix loop (--auto-fix, max 2 iterations) PERSIST: uat.md gaps updated
67
+ S_REPORT -- Final report, route next step PERSIST: --
68
+ </states>
69
+
70
+ <transitions>
71
+
72
+ S_RESOLVE:
73
+ -> S_PRESENT WHEN: --session ID (resume from uat.md)
74
+ -> S_SMOKE WHEN: --smoke flag set
75
+ -> S_DESIGN WHEN: normal flow DO: resolve target, validate verification.json exists
76
+
77
+ S_SMOKE:
78
+ -> S_DESIGN WHEN: all pass
79
+ -> ERROR(E003) WHEN: any fail (suggest quality-debug)
80
+
81
+ **Smoke checks**:
82
+ | Test | Method |
83
+ |------|--------|
84
+ | App starts | Run start command, check exit code |
85
+ | Routes respond | curl/fetch main routes, check non-error |
86
+ | Build clean | Build command succeeds |
87
+ | Dependencies | Install check, no missing deps |
88
+
89
+ S_DESIGN:
90
+ -> S_CREATE_UAT DO: A_DESIGN_SCENARIOS
91
+
92
+ S_CREATE_UAT:
93
+ -> S_PRESENT DO: write uat.md (archive previous to .history/ if exists)
94
+
95
+ **uat.md template**:
96
+ ```markdown
97
+ ---
98
+ status: testing
99
+ target: {phase slug or scratch ID}
100
+ source: [list of summary files]
101
+ started: {ISO}
102
+ updated: {ISO}
103
+ ---
104
+ ## Current Test
105
+ number: 1
106
+ name: {first test}
107
+ expected: |
108
+ {observable behavior}
109
+ awaiting: user response
110
+
111
+ ## Tests
112
+ ### 1. {Test Name}
113
+ expected: {behavior}
114
+ result: [pending]
115
+
116
+ ## Summary
117
+ total: {N} passed: 0 issues: 0 pending: {N} skipped: 0
118
+
119
+ ## Gaps
120
+ [none yet]
121
+ ```
122
+
123
+ S_PRESENT:
124
+ -> S_PRESENT WHEN: more tests DO: A_PRESENT_AND_PROCESS
125
+ -> S_COMPLETE WHEN: all tests done
126
+
127
+ S_COMPLETE:
128
+ -> S_DIAGNOSE WHEN: issues found DO: write output files, register artifact in state.json
129
+ -> S_REPORT WHEN: no issues
130
+
131
+ **test-results.json**: `{ target, completed_at, results: [...], summary: { total, passed, issues, skipped } }`
132
+ **coverage-report.json**: `{ target, requirements_covered: [...], requirements_uncovered: [...], coverage_percentage }`
133
+
134
+ S_DIAGNOSE:
135
+ -> S_GAP_CLOSE DO: A_DIAGNOSE_GAPS
136
+
137
+ S_GAP_CLOSE:
138
+ -> S_REPORT WHEN: --auto-fix not set DO: present options (auto-fix/debug/plan/manual)
139
+ -> S_REPORT WHEN: --auto-fix, loop done DO: A_GAP_FIX_LOOP (plan --gaps -> execute -> verify, max 2)
140
+
141
+ S_REPORT:
142
+ -> END DO: A_REPORT
143
+
144
+ </transitions>
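The transitions above can be read as a pure function from the current state and a few conditions to the next state. The sketch below is a hedged illustration: `next_state` and the `ctx` keys are hypothetical, and checking `--session` before `--smoke` in `S_RESOLVE` simply follows the order in which the transitions are listed.

```python
def next_state(state, ctx):
    """Return the next state name for `state`, given a dict of flags and conditions."""
    if state == "S_RESOLVE":
        if ctx.get("session_id"):           # --session ID: resume from uat.md
            return "S_PRESENT"
        return "S_SMOKE" if ctx.get("smoke") else "S_DESIGN"
    if state == "S_SMOKE":
        return "S_DESIGN" if ctx.get("smoke_passed") else "ERROR(E003)"
    if state == "S_DESIGN":
        return "S_CREATE_UAT"
    if state == "S_CREATE_UAT":
        return "S_PRESENT"
    if state == "S_PRESENT":
        return "S_PRESENT" if ctx.get("tests_remaining", 0) > 0 else "S_COMPLETE"
    if state == "S_COMPLETE":
        return "S_DIAGNOSE" if ctx.get("issues", 0) > 0 else "S_REPORT"
    if state == "S_DIAGNOSE":
        return "S_GAP_CLOSE"
    if state == "S_GAP_CLOSE":
        return "S_REPORT"                   # both branches end in S_REPORT
    if state == "S_REPORT":
        return "END"
    raise ValueError(f"unknown state: {state}")
```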
145
+
146
+ <actions>
147
+
148
+ ### A_DESIGN_SCENARIOS
149
+
150
+ 1. Load verification context: verification.json, validation.json, index.json, plan.json, summaries
151
+ 2. Load quality artifacts: review findings (type=review) -> extra tests (source: review_finding); debug root causes (type=debug) -> regression tests (source: debug_root_cause)
152
+ 3. Load test tools: `maestro spec load --category test --keyword <feature>` -> additional scenarios (source: tool)
153
+ 4. Design scenarios from user-observable outcomes: id (T-NNN), name, category (e2e/integration/unit), expected, requirement_ref, source
154
+ 5. Write test-plan.json
155
+
156
+ ### A_PRESENT_AND_PROCESS
157
+
158
+ Present one test: `TEST {n}/{total}: {name}` + expected behavior + prompt.
159
+
160
+ Response processing:
161
+
162
+ | Response | Action |
163
+ |----------|--------|
164
+ | "pass"/"yes"/"ok"/"next"/empty | Mark pass |
165
+ | "skip"/"can't test"/"n/a" | Mark skipped |
166
+ | Anything else | Log issue, infer severity |
167
+
168
+ **Severity inference** (never ask):
169
+ - crashes/error/exception/fails/can't use -> blocker
170
+ - doesn't work/nothing happens/wrong/broken -> major
171
+ - works but.../slow/weird/minor/inconsistent -> minor
172
+ - color/spacing/alignment/looks off/typo -> cosmetic
173
+ - Default: major
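The response table and severity rules above amount to a keyword lookup with a default. A minimal sketch follows, assuming case-insensitive substring matching with tiers checked from most to least severe; the matching strategy and the names `process_response` and `infer_severity` are assumptions, not the package's implementation.

```python
PASS_WORDS = {"pass", "yes", "ok", "next", ""}
SKIP_WORDS = {"skip", "can't test", "n/a"}

# Checked in order; the first matching tier wins. Phrases come from the list above
# ("crash" is used as a stem so that "crashes" and "crashed" both match).
SEVERITY_KEYWORDS = [
    ("blocker", ["crash", "error", "exception", "fails", "can't use"]),
    ("major", ["doesn't work", "nothing happens", "wrong", "broken"]),
    ("minor", ["works but", "slow", "weird", "minor", "inconsistent"]),
    ("cosmetic", ["color", "spacing", "alignment", "looks off", "typo"]),
]

def infer_severity(text):
    """Infer severity from free text; falls back to 'major' when nothing matches."""
    lowered = text.lower()
    for severity, keywords in SEVERITY_KEYWORDS:
        if any(keyword in lowered for keyword in keywords):
            return severity
    return "major"

def process_response(text):
    """Map a user response to (result, severity); severity is None unless an issue."""
    normalized = text.strip().lower()
    if normalized in PASS_WORDS:
        return ("pass", None)
    if normalized in SKIP_WORDS:
        return ("skipped", None)
    return ("issue", infer_severity(normalized))
```

For example, `process_response("Clicking login does nothing")` matches no keyword and therefore falls through to the default, `("issue", "major")`.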
174
+
175
+ On issue: auto-create in `.workflow/issues/issues.jsonl`:
176
+ ```json
177
+ { "id": "ISS-{YYYYMMDD}-{NNN}", "title": "UAT: {test.name} - {response truncated 100}",
178
+ "status": "registered", "priority": "{from severity}", "severity": "{inferred}",
179
+ "source": "uat", "phase_ref": "{phase}", "gap_ref": "{test.id}",
180
+ "description": "Expected: {expected}. Reported: {verbatim}", "tags": ["uat"] }
181
+ ```
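Building the record shown above might look like the following sketch. The severity-to-priority mapping is not given in the source, so `priority` is passed in by the caller; the three-digit zero padding of `NNN` is inferred from the sample ID `ISS-20260503-001`. `build_issue` and `append_issue` are hypothetical helpers.

```python
import json
from datetime import date

def build_issue(test_id, test_name, expected, response, severity, priority,
                phase, seq, today=None):
    """Build one issues.jsonl record following the template above."""
    today = today or date.today()
    return {
        "id": f"ISS-{today:%Y%m%d}-{seq:03d}",
        "title": f"UAT: {test_name} - {response[:100]}",   # response truncated to 100 chars
        "status": "registered",
        "priority": priority,       # mapping from severity is not specified in the source
        "severity": severity,
        "source": "uat",
        "phase_ref": phase,
        "gap_ref": test_id,
        "description": f"Expected: {expected}. Reported: {response}",
        "tags": ["uat"],
    }

def append_issue(path, issue):
    """Append one record as a single JSON line (JSONL)."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(issue, ensure_ascii=False) + "\n")
```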
182
+
183
+ ### A_DIAGNOSE_GAPS
184
+
185
+ 1. Cluster gaps by component/module/feature
186
+ 2. Build diagnosis.csv: one row per gap with target_files, source_context
187
+ 3. `spawn_agents_on_csv` for parallel diagnosis
188
+ 4. **Diagnosis agent**: Find root cause (not symptom), suggest fix direction, list affected files. Do NOT modify files. Reference issue_id for traceability.
189
+ 5. Merge results: update uat.md gaps with root_cause, fix_direction, affected_files
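Step 5 (merging results back into the gaps) could be sketched as a join on `issue_id`. This assumes each result row carries both the input and output columns, as the single-header schema suggests, and that rows with a non-empty `error` are skipped; `merge_diagnosis` is a hypothetical helper.

```python
def merge_diagnosis(gaps, result_rows):
    """Return copies of `gaps` enriched with diagnosis fields, matched by issue_id."""
    by_issue = {row["issue_id"]: row for row in result_rows}
    merged = []
    for gap in gaps:
        updated = dict(gap)
        row = by_issue.get(gap.get("issue_id"))
        if row and not row.get("error"):
            updated["root_cause"] = row["root_cause"]
            updated["fix_direction"] = row["fix_direction"]
            # affected_files is semicolon-separated in the CSV, a list in uat.md.
            updated["affected_files"] = [f for f in row["affected_files"].split(";") if f]
        merged.append(updated)
    return merged
```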
190
+
191
+ ### A_GAP_FIX_LOOP
192
+
193
+ Max 2 iterations:
194
+ 1. `maestro-plan "{phase} --gaps"` (registered -> planning)
195
+ 2. `maestro-execute "{phase}"` (planning -> executing)
196
+ 3. `maestro-verify "{phase}"` (resolved -> completed, unresolved -> failed)
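The bounded loop and its lifecycle updates can be sketched as below. `plan`, `execute`, and `verify` are hypothetical callables standing in for the three commands, with `verify` assumed to return the IDs of gaps that are still unresolved. Marking an issue `failed` only after the final iteration is also an assumption, since the source does not say when unresolved issues change status.

```python
def gap_fix_loop(phase, issue_ids, plan, execute, verify, max_iterations=2):
    """Run plan -> execute -> verify at most `max_iterations` times.

    Returns (iterations_run, status_by_issue_id).
    """
    status = {issue_id: "registered" for issue_id in issue_ids}
    open_ids = set(issue_ids)
    iterations = 0
    while open_ids and iterations < max_iterations:
        iterations += 1
        for issue_id in open_ids:
            status[issue_id] = "planning"       # before plan
        plan(phase)
        for issue_id in open_ids:
            status[issue_id] = "executing"      # before execute
        execute(phase)
        unresolved = set(verify(phase)) & open_ids
        for issue_id in open_ids - unresolved:
            status[issue_id] = "completed"      # resolved on re-verify
        open_ids = unresolved
    for issue_id in open_ids:
        status[issue_id] = "failed"             # still open after the last iteration
    return iterations, status
```

The hard cap means a gap that never resolves costs at most two plan/execute/verify cycles before the loop hands off to manual intervention.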
197
+
198
+ ### A_REPORT
199
+
200
+ 1. UAT confidence scoring (4 dims: scenario_coverage, diagnostic_depth, observation_quality, closure_completeness). Readiness gate: block if scenario_coverage < 40% or blocker without diagnosis.
201
+ 2. Display summary: smoke results, pass/fail/skip counts, diagnosis stats, auto-fix results
202
+ 3. Register artifact in state.json (type: test)
203
+ 4. Route: all passed -> milestone-audit; auto-fix succeeded -> maestro-verify; gaps remain -> quality-debug; low coverage -> quality-auto-test
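The readiness gate in step 1 is a two-condition check. In the sketch below, `scenario_coverage` is a percentage and "without diagnosis" is interpreted as a gap with no `root_cause`; both the interpretation and the name `readiness_gate` are assumptions.

```python
def readiness_gate(scenario_coverage, gaps):
    """Return (ready, reasons). Blocks on low coverage or undiagnosed blockers."""
    reasons = []
    if scenario_coverage < 40:
        reasons.append(f"scenario_coverage {scenario_coverage}% is below 40%")
    undiagnosed = [g for g in gaps
                   if g.get("severity") == "blocker" and not g.get("root_cause")]
    if undiagnosed:
        reasons.append(f"{len(undiagnosed)} blocker gap(s) without diagnosis")
    return (not reasons, reasons)
```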
204
+
205
+ </actions>
206
+
207
+ </state_machine>
208
+
209
+ <error_codes>
210
+ | Condition | Recovery |
211
+ |-----------|----------|
212
+ | Phase/task target required, no active sessions | Prompt for phase number |
213
+ | No verification.json (phase not verified) | Suggest maestro-verify |
214
+ | Smoke test failed (app won't start) | Suggest quality-debug |
215
+ | Coverage below threshold | Suggest quality-auto-test |
216
+ </error_codes>
217
+
218
+ <success_criteria>
219
+ - [ ] Tests presented one at a time, severity inferred (never asked)
220
+ - [ ] Issues auto-created in issues.jsonl for all failures
221
+ - [ ] uat.md persists across context resets
222
+ - [ ] Quality artifacts loaded (review -> extra tests, debug -> regression tests)
223
+ - [ ] CSV parallel diagnosis via spawn_agents_on_csv
224
+ - [ ] Gap-fix loop max 2 iterations (if --auto-fix)
225
+ - [ ] Issue lifecycle synced through gap-fix loop
226
+ - [ ] UAT confidence scored (4-dimension model)
227
+ - [ ] test-results.json + coverage-report.json written
228
+ </success_criteria>
229
+ </output>