maestro-flow 0.5.3 → 0.5.31

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (259)
  1. package/.agents/skills/learn-follow/SKILL.md +114 -114
  2. package/.agents/skills/learn-investigate/SKILL.md +138 -139
  3. package/.agents/skills/learn-second-opinion/SKILL.md +105 -109
  4. package/.agents/skills/maestro/SKILL.md +2 -10
  5. package/.agents/skills/maestro-amend/SKILL.md +152 -152
  6. package/.agents/skills/maestro-analyze/SKILL.md +201 -252
  7. package/.agents/skills/maestro-blueprint/SKILL.md +175 -190
  8. package/.agents/skills/maestro-brainstorm/SKILL.md +196 -200
  9. package/.agents/skills/maestro-collab/SKILL.md +159 -159
  10. package/.agents/skills/maestro-companion/SKILL.md +517 -517
  11. package/.agents/skills/maestro-composer/SKILL.md +173 -164
  12. package/.agents/skills/maestro-execute/SKILL.md +169 -170
  13. package/.agents/skills/maestro-fork/SKILL.md +97 -96
  14. package/.agents/skills/maestro-grill/SKILL.md +161 -162
  15. package/.agents/skills/maestro-guard/SKILL.md +93 -92
  16. package/.agents/skills/maestro-impeccable/SKILL.md +296 -253
  17. package/.agents/skills/maestro-init/SKILL.md +117 -118
  18. package/.agents/skills/maestro-merge/SKILL.md +73 -66
  19. package/.agents/skills/maestro-milestone-audit/SKILL.md +4 -10
  20. package/.agents/skills/maestro-milestone-complete/SKILL.md +6 -7
  21. package/.agents/skills/maestro-milestone-release/SKILL.md +122 -131
  22. package/.agents/skills/maestro-next/SKILL.md +241 -245
  23. package/.agents/skills/maestro-overlay/SKILL.md +176 -166
  24. package/.agents/skills/maestro-plan/SKILL.md +211 -197
  25. package/.agents/skills/maestro-player/SKILL.md +167 -167
  26. package/.agents/skills/maestro-quick/SKILL.md +69 -63
  27. package/.agents/skills/maestro-ralph/SKILL.md +2 -36
  28. package/.agents/skills/maestro-ralph-beta/SKILL.md +861 -872
  29. package/.agents/skills/maestro-ralph-execute/SKILL.md +234 -234
  30. package/.agents/skills/maestro-roadmap/SKILL.md +159 -172
  31. package/.agents/skills/maestro-swarm-workflow/SKILL.md +229 -250
  32. package/.agents/skills/maestro-tools-execute/SKILL.md +108 -103
  33. package/.agents/skills/maestro-tools-register/SKILL.md +148 -143
  34. package/.agents/skills/maestro-ui-codify/SKILL.md +103 -86
  35. package/.agents/skills/maestro-universal-workflow/SKILL.md +534 -547
  36. package/.agents/skills/maestro-update/SKILL.md +109 -106
  37. package/.agents/skills/manage-codebase-rebuild/SKILL.md +73 -71
  38. package/.agents/skills/manage-harvest/SKILL.md +83 -81
  39. package/.agents/skills/manage-issue/SKILL.md +59 -60
  40. package/.agents/skills/manage-issue-discover/SKILL.md +70 -68
  41. package/.agents/skills/manage-kg-extractors/SKILL.md +130 -0
  42. package/.agents/skills/manage-knowhow/SKILL.md +70 -66
  43. package/.agents/skills/manage-knowhow-capture/SKILL.md +79 -69
  44. package/.agents/skills/manage-knowledge-audit/SKILL.md +91 -74
  45. package/.agents/skills/manage-status/SKILL.md +52 -42
  46. package/.agents/skills/manage-wiki/SKILL.md +69 -58
  47. package/.agents/skills/odyssey-debug/SKILL.md +445 -459
  48. package/.agents/skills/odyssey-improve/SKILL.md +477 -491
  49. package/.agents/skills/odyssey-planex/SKILL.md +576 -587
  50. package/.agents/skills/odyssey-review-test-fix/SKILL.md +400 -413
  51. package/.agents/skills/odyssey-ui/SKILL.md +431 -448
  52. package/.agents/skills/quality-auto-test/SKILL.md +140 -123
  53. package/.agents/skills/quality-debug/SKILL.md +145 -106
  54. package/.agents/skills/quality-refactor/SKILL.md +91 -53
  55. package/.agents/skills/quality-retrospective/SKILL.md +109 -63
  56. package/.agents/skills/quality-review/SKILL.md +141 -114
  57. package/.agents/skills/quality-sync/SKILL.md +74 -38
  58. package/.agents/skills/quality-test/SKILL.md +133 -103
  59. package/.agents/skills/security-audit/SKILL.md +217 -166
  60. package/.agents/skills/spec-add/SKILL.md +66 -59
  61. package/.agents/skills/spec-load/SKILL.md +68 -68
  62. package/.agents/skills/spec-remove/SKILL.md +42 -42
  63. package/.agents/skills/spec-setup/SKILL.md +38 -41
  64. package/.agy/skills/learn-follow/SKILL.md +114 -114
  65. package/.agy/skills/learn-investigate/SKILL.md +138 -139
  66. package/.agy/skills/learn-second-opinion/SKILL.md +105 -109
  67. package/.agy/skills/maestro/SKILL.md +2 -10
  68. package/.agy/skills/maestro-amend/SKILL.md +152 -152
  69. package/.agy/skills/maestro-analyze/SKILL.md +201 -252
  70. package/.agy/skills/maestro-blueprint/SKILL.md +175 -190
  71. package/.agy/skills/maestro-brainstorm/SKILL.md +196 -200
  72. package/.agy/skills/maestro-collab/SKILL.md +159 -159
  73. package/.agy/skills/maestro-companion/SKILL.md +517 -517
  74. package/.agy/skills/maestro-composer/SKILL.md +173 -164
  75. package/.agy/skills/maestro-execute/SKILL.md +169 -170
  76. package/.agy/skills/maestro-fork/SKILL.md +97 -96
  77. package/.agy/skills/maestro-grill/SKILL.md +161 -162
  78. package/.agy/skills/maestro-guard/SKILL.md +93 -92
  79. package/.agy/skills/maestro-impeccable/SKILL.md +296 -253
  80. package/.agy/skills/maestro-init/SKILL.md +117 -118
  81. package/.agy/skills/maestro-merge/SKILL.md +73 -66
  82. package/.agy/skills/maestro-milestone-audit/SKILL.md +4 -10
  83. package/.agy/skills/maestro-milestone-complete/SKILL.md +6 -7
  84. package/.agy/skills/maestro-milestone-release/SKILL.md +122 -131
  85. package/.agy/skills/maestro-next/SKILL.md +241 -245
  86. package/.agy/skills/maestro-overlay/SKILL.md +176 -166
  87. package/.agy/skills/maestro-plan/SKILL.md +211 -197
  88. package/.agy/skills/maestro-player/SKILL.md +167 -167
  89. package/.agy/skills/maestro-quick/SKILL.md +69 -63
  90. package/.agy/skills/maestro-ralph/SKILL.md +2 -36
  91. package/.agy/skills/maestro-ralph-beta/SKILL.md +861 -872
  92. package/.agy/skills/maestro-ralph-execute/SKILL.md +234 -234
  93. package/.agy/skills/maestro-roadmap/SKILL.md +159 -172
  94. package/.agy/skills/maestro-swarm-workflow/SKILL.md +229 -250
  95. package/.agy/skills/maestro-tools-execute/SKILL.md +108 -103
  96. package/.agy/skills/maestro-tools-register/SKILL.md +148 -143
  97. package/.agy/skills/maestro-ui-codify/SKILL.md +103 -86
  98. package/.agy/skills/maestro-universal-workflow/SKILL.md +534 -547
  99. package/.agy/skills/maestro-update/SKILL.md +109 -106
  100. package/.agy/skills/manage-codebase-rebuild/SKILL.md +73 -71
  101. package/.agy/skills/manage-harvest/SKILL.md +83 -81
  102. package/.agy/skills/manage-issue/SKILL.md +59 -60
  103. package/.agy/skills/manage-issue-discover/SKILL.md +70 -68
  104. package/.agy/skills/manage-kg-extractors/SKILL.md +130 -0
  105. package/.agy/skills/manage-knowhow/SKILL.md +70 -66
  106. package/.agy/skills/manage-knowhow-capture/SKILL.md +79 -69
  107. package/.agy/skills/manage-knowledge-audit/SKILL.md +91 -74
  108. package/.agy/skills/manage-status/SKILL.md +52 -42
  109. package/.agy/skills/manage-wiki/SKILL.md +69 -58
  110. package/.agy/skills/odyssey-debug/SKILL.md +445 -459
  111. package/.agy/skills/odyssey-improve/SKILL.md +477 -491
  112. package/.agy/skills/odyssey-planex/SKILL.md +576 -587
  113. package/.agy/skills/odyssey-review-test-fix/SKILL.md +400 -413
  114. package/.agy/skills/odyssey-ui/SKILL.md +431 -448
  115. package/.agy/skills/quality-auto-test/SKILL.md +140 -123
  116. package/.agy/skills/quality-debug/SKILL.md +145 -106
  117. package/.agy/skills/quality-refactor/SKILL.md +91 -53
  118. package/.agy/skills/quality-retrospective/SKILL.md +109 -63
  119. package/.agy/skills/quality-review/SKILL.md +141 -114
  120. package/.agy/skills/quality-sync/SKILL.md +74 -38
  121. package/.agy/skills/quality-test/SKILL.md +133 -103
  122. package/.agy/skills/security-audit/SKILL.md +217 -166
  123. package/.agy/skills/spec-add/SKILL.md +66 -59
  124. package/.agy/skills/spec-load/SKILL.md +68 -68
  125. package/.agy/skills/spec-remove/SKILL.md +42 -42
  126. package/.agy/skills/spec-setup/SKILL.md +38 -41
  127. package/.claude/commands/learn-follow.md +127 -127
  128. package/.claude/commands/learn-investigate.md +151 -152
  129. package/.claude/commands/learn-second-opinion.md +118 -122
  130. package/.claude/commands/maestro-amend.md +164 -164
  131. package/.claude/commands/maestro-analyze.md +215 -266
  132. package/.claude/commands/maestro-blueprint.md +189 -204
  133. package/.claude/commands/maestro-brainstorm.md +209 -213
  134. package/.claude/commands/maestro-collab.md +172 -172
  135. package/.claude/commands/maestro-companion.md +531 -531
  136. package/.claude/commands/maestro-composer.md +188 -179
  137. package/.claude/commands/maestro-execute.md +183 -184
  138. package/.claude/commands/maestro-fork.md +111 -110
  139. package/.claude/commands/maestro-grill.md +175 -176
  140. package/.claude/commands/maestro-guard.md +103 -102
  141. package/.claude/commands/maestro-impeccable.md +311 -268
  142. package/.claude/commands/maestro-init.md +130 -131
  143. package/.claude/commands/maestro-merge.md +87 -80
  144. package/.claude/commands/maestro-milestone-audit.md +4 -10
  145. package/.claude/commands/maestro-milestone-complete.md +6 -7
  146. package/.claude/commands/maestro-milestone-release.md +136 -145
  147. package/.claude/commands/maestro-next.md +253 -257
  148. package/.claude/commands/maestro-overlay.md +188 -178
  149. package/.claude/commands/maestro-plan.md +225 -211
  150. package/.claude/commands/maestro-player.md +182 -182
  151. package/.claude/commands/maestro-quick.md +83 -77
  152. package/.claude/commands/maestro-ralph-beta.md +875 -886
  153. package/.claude/commands/maestro-ralph-execute.md +247 -247
  154. package/.claude/commands/maestro-ralph.md +2 -36
  155. package/.claude/commands/maestro-roadmap.md +173 -186
  156. package/.claude/commands/maestro-swarm-workflow.md +243 -264
  157. package/.claude/commands/maestro-tools-execute.md +122 -117
  158. package/.claude/commands/maestro-tools-register.md +162 -157
  159. package/.claude/commands/maestro-ui-codify.md +117 -100
  160. package/.claude/commands/maestro-universal-workflow.md +548 -561
  161. package/.claude/commands/maestro-update.md +122 -119
  162. package/.claude/commands/maestro.md +2 -10
  163. package/.claude/commands/manage-codebase-rebuild.md +87 -85
  164. package/.claude/commands/manage-harvest.md +97 -95
  165. package/.claude/commands/manage-issue-discover.md +83 -81
  166. package/.claude/commands/manage-issue.md +72 -73
  167. package/.claude/commands/manage-kg-extractors.md +128 -0
  168. package/.claude/commands/manage-knowhow-capture.md +92 -82
  169. package/.claude/commands/manage-knowhow.md +83 -79
  170. package/.claude/commands/manage-knowledge-audit.md +105 -88
  171. package/.claude/commands/manage-status.md +62 -52
  172. package/.claude/commands/manage-wiki.md +82 -71
  173. package/.claude/commands/odyssey-debug.md +459 -473
  174. package/.claude/commands/odyssey-improve.md +491 -505
  175. package/.claude/commands/odyssey-planex.md +590 -601
  176. package/.claude/commands/odyssey-review-test-fix.md +414 -427
  177. package/.claude/commands/odyssey-ui.md +445 -462
  178. package/.claude/commands/quality-auto-test.md +153 -136
  179. package/.claude/commands/quality-debug.md +159 -120
  180. package/.claude/commands/quality-refactor.md +105 -67
  181. package/.claude/commands/quality-retrospective.md +123 -77
  182. package/.claude/commands/quality-review.md +155 -128
  183. package/.claude/commands/quality-sync.md +88 -52
  184. package/.claude/commands/quality-test.md +147 -117
  185. package/.claude/commands/security-audit.md +230 -179
  186. package/.claude/commands/spec-add.md +77 -70
  187. package/.claude/commands/spec-load.md +78 -78
  188. package/.claude/commands/spec-remove.md +55 -55
  189. package/.claude/commands/spec-setup.md +49 -52
  190. package/dist/src/cli.js +1 -1
  191. package/dist/src/cli.js.map +1 -1
  192. package/dist/src/commands/kg.d.ts.map +1 -1
  193. package/dist/src/commands/kg.js +11 -5
  194. package/dist/src/commands/kg.js.map +1 -1
  195. package/dist/src/graph/kg/extraction/code/code-extractor.d.ts +2 -0
  196. package/dist/src/graph/kg/extraction/code/code-extractor.d.ts.map +1 -1
  197. package/dist/src/graph/kg/extraction/code/code-extractor.js +32 -3
  198. package/dist/src/graph/kg/extraction/code/code-extractor.js.map +1 -1
  199. package/dist/src/graph/kg/extraction/code/plugin-engine.d.ts +35 -0
  200. package/dist/src/graph/kg/extraction/code/plugin-engine.d.ts.map +1 -0
  201. package/dist/src/graph/kg/extraction/code/plugin-engine.js +573 -0
  202. package/dist/src/graph/kg/extraction/code/plugin-engine.js.map +1 -0
  203. package/dist/src/graph/kg/extraction/code/plugin-types.d.ts +95 -0
  204. package/dist/src/graph/kg/extraction/code/plugin-types.d.ts.map +1 -0
  205. package/dist/src/graph/kg/extraction/code/plugin-types.js +5 -0
  206. package/dist/src/graph/kg/extraction/code/plugin-types.js.map +1 -0
  207. package/dist/src/graph/kg/extraction/orchestrator.d.ts.map +1 -1
  208. package/dist/src/graph/kg/extraction/orchestrator.js +17 -5
  209. package/dist/src/graph/kg/extraction/orchestrator.js.map +1 -1
  210. package/dist/src/graph/kg/schema.sql +16 -11
  211. package/dist/src/graph/kg/surface/cli.d.ts.map +1 -1
  212. package/dist/src/graph/kg/surface/cli.js +153 -56
  213. package/dist/src/graph/kg/surface/cli.js.map +1 -1
  214. package/dist/src/hooks/workspace.d.ts +4 -2
  215. package/dist/src/hooks/workspace.d.ts.map +1 -1
  216. package/dist/src/hooks/workspace.js +6 -2
  217. package/dist/src/hooks/workspace.js.map +1 -1
  218. package/package.json +91 -91
  219. package/workflows/analyze.md +25 -49
  220. package/workflows/auto-test.md +699 -699
  221. package/workflows/blueprint.md +403 -431
  222. package/workflows/brainstorm.md +54 -195
  223. package/workflows/business-test.md +570 -570
  224. package/workflows/claude-instructions.md +23 -51
  225. package/workflows/codex-instructions.md +27 -77
  226. package/workflows/coding-philosophy.md +69 -69
  227. package/workflows/command-authoring.md +823 -823
  228. package/workflows/debug.md +43 -98
  229. package/workflows/delegate-usage.md +39 -241
  230. package/workflows/execute.md +4 -53
  231. package/workflows/grill.md +12 -56
  232. package/workflows/harvest.md +22 -68
  233. package/workflows/init.md +148 -148
  234. package/workflows/instruction-authoring-guide.md +97 -0
  235. package/workflows/issue-execute.md +110 -110
  236. package/workflows/issue-gaps-analyze.codex.md +260 -260
  237. package/workflows/issue-gaps-analyze.md +216 -216
  238. package/workflows/issue-plan.md +110 -110
  239. package/workflows/issue.md +338 -346
  240. package/workflows/knowhow.md +0 -32
  241. package/workflows/learn.md +277 -277
  242. package/workflows/maestro-chain-execute.md +20 -20
  243. package/workflows/refactor.md +22 -44
  244. package/workflows/retrospective.md +16 -65
  245. package/workflows/review.md +446 -486
  246. package/workflows/roadmap.md +35 -132
  247. package/workflows/skill-authoring.md +265 -265
  248. package/workflows/spec-generate.md +470 -470
  249. package/workflows/specs-remove.md +104 -104
  250. package/workflows/sync.md +11 -41
  251. package/workflows/test-gen.md +226 -226
  252. package/workflows/test.md +385 -475
  253. package/workflows/ui-design.md +391 -391
  254. package/workflows/ui-style.md +199 -199
  255. package/workflows/wiki-connect.md +151 -151
  256. package/workflows/wiki-digest.md +178 -178
  257. package/workflows/wiki-manage.md +109 -109
  258. package/workflows/cli-tools-usage.md +0 -252
  259. package/workflows/delegate-protocol.codex.md +0 -65
package/workflows/test.md CHANGED
@@ -1,475 +1,385 @@
1
- # Test Workflow (UAT)
2
-
3
- Validate built features through conversational UAT testing with persistent state, auto-diagnosis via parallel debug agents, and gap-fix closure loop.
4
-
5
- User tests, Claude records. One test at a time. Plain text responses.
6
- Severity inferred from natural language -- never ask "how severe is this?"
7
-
8
- **Philosophy: Show expected, ask if reality matches.**
9
-
10
- Claude presents what SHOULD happen. User confirms or describes what's different.
11
- - "yes" / "y" / "next" / empty / "pass" -> pass
12
- - "skip" / "can't test" / "n/a" -> skipped
13
- - Anything else -> logged as issue, severity inferred
14
-
15
- No Pass/Fail buttons. No severity questions. Just: "Here's what should happen. Does it?"
16
-
17
- ---
18
-
19
- ### Step 1: Resolve Target
20
-
21
- Determine test target from $ARGUMENTS:
22
-
23
- **If phase number provided** (e.g., "3"):
24
- - Set `$TARGET_TYPE = "phase"`
25
- - Resolve phase dir: look up `phaseNum` in `.workflow/state.json` artifacts (type=execute), derive `PHASE_DIR = ".workflow/" + art.path`. Error if not found.
26
- - Load `$PHASE_DIR/index.json` for context
27
-
28
- **If scratch task ID provided:**
29
- - Set `$TARGET_TYPE = "scratch"`
30
- - Set `$SCRATCH_DIR = ".workflow/scratch/{id}/"`
31
- - Load `$SCRATCH_DIR/index.json` for context
32
-
33
- **If nothing provided:**
34
- - Check for active UAT sessions (see Step 2)
35
- - If none found, prompt user for phase number or scratch task
36
-
37
- **Flags:**
38
- - `--smoke` -- Run cold-start smoke tests before UAT
39
- - `--auto-fix` -- Auto-trigger gap-fix loop on failures
40
-
41
- Validate target exists and has been verified (verification.json present). (E002)
42
-
43
- ---
44
-
45
- ### Step 2: Check Active Sessions
46
-
47
- ```bash
48
- # Check scratch dirs (resolved via artifact registry) for active UAT sessions
49
- find .workflow/scratch -name "uat.md" -type f 2>/dev/null | head -5
50
- ```
51
-
52
- Read each file's frontmatter (status, target) and Current Test section.
53
-
54
- **If active sessions exist AND no $ARGUMENTS:**
55
-
56
- Display inline:
57
- ```
58
- ## Active UAT Sessions
59
-
60
- | # | Target | Status | Current Test | Progress |
61
- |---|--------|--------|--------------|----------|
62
- | 1 | 04-comments | testing | 3. Reply to Comment | 2/6 |
63
- | 2 | quick-fix-nav | testing | 1. Nav Links | 0/4 |
64
-
65
- Reply with a number to resume, or provide a phase/task to start new.
66
- ```
67
-
68
- Wait for user response.
69
- - Number -> resume that session (go to Step 9: Resume From File)
70
- - Phase/task ID -> new session (go to Step 4: Find Testables)
71
-
72
- **If active sessions exist AND $ARGUMENTS provided:**
73
- Check if session exists for that target. If yes, offer resume or restart.
74
-
75
- **If no active sessions AND no $ARGUMENTS:**
76
- Prompt: "No active UAT sessions. Provide a phase number or scratch task ID to start testing."
77
-
78
- **If no active sessions AND $ARGUMENTS:**
79
- Continue to Step 3 or Step 4.
80
-
81
- ---
82
-
83
- ### Step 3: Run Smoke Tests (if --smoke)
84
-
85
- Skip if --smoke not set.
86
-
87
- Inject basic sanity tests BEFORE UAT scenarios:
88
-
89
- | Smoke Test | Check | Method |
90
- |------------|-------|--------|
91
- | App starts | Process runs without crash | `bash: start command, check exit code` |
92
- | Routes respond | Key endpoints return non-error | `bash: curl/fetch main routes` |
93
- | Build clean | No build errors | `bash: build command succeeds` |
94
- | Dependencies | No missing deps | `bash: install check` |
95
-
96
- Record smoke results in uat.md under `## Smoke Tests` section.
97
- If any smoke test fails: abort UAT, report as blocker, suggest Skill({ skill: "quality-debug" }). (E003)
98
-
99
- ---
100
-
101
- ### Step 4: Load Verification Context
102
-
103
- Read from target directory:
104
- - verification.json -- must_haves with truth/artifact/wiring status
105
- - validation.json -- requirement-to-test mapping
106
- - index.json -- success_criteria
107
- - plan.json -- task overview
108
- - All `.summaries/TASK-*.md` -- execution results
109
-
110
- ```bash
111
- ls "$OUTPUT_DIR/.summaries/"*summary*.md 2>/dev/null
112
- ```
113
-
114
- Build testable list: user-observable outcomes from success_criteria + must_haves + task accomplishments.
115
-
116
- ---
117
-
118
- ### Step 5: Design Test Scenarios
119
-
120
- For each testable item, create a scenario:
121
- - **id**: T-001, T-002, ...
122
- - **name**: Brief test name
123
- - **category**: "e2e" | "integration" | "unit"
124
- - **expected**: Specific observable behavior (what user should see)
125
- - **requirement_ref**: Which success criterion this covers
126
-
127
- Write test-plan.json to `.tests/`:
128
- ```json
129
- {
130
- "target": "{phase or scratch ID}",
131
- "generated_at": "{ISO timestamp}",
132
- "tests": [...],
133
- "coverage": {
134
- "requirements_mapped": ["SC-001"],
135
- "requirements_unmapped": ["SC-003"]
136
- }
137
- }
138
- ```
139
-
140
- ```bash
141
- mkdir -p "$OUTPUT_DIR/.tests"
142
- ```
143
-
144
- Focus on USER-OBSERVABLE outcomes, not implementation details.
145
- Skip internal/non-observable items (refactors, type changes).
146
-
147
- ---
148
-
149
- ### Step 6: Create UAT File
150
-
151
- **Archive previous UAT artifacts** before writing: if `$OUTPUT_DIR/uat.md` exists, move it to `$OUTPUT_DIR/.history/uat-{YYYY-MM-DDTHH-mm-ss}.md`.
152
-
153
- Build test list from test-plan.json. Create file at `$OUTPUT_DIR/uat.md`:
154
-
155
- ```markdown
156
- ---
157
- status: testing
158
- target: {phase slug or scratch ID}
159
- source: [list of summary files]
160
- started: {ISO timestamp}
161
- updated: {ISO timestamp}
162
- ---
163
-
164
- ## Current Test
165
- <!-- OVERWRITE each test - shows where we are -->
166
-
167
- number: 1
168
- name: {first test name}
169
- expected: |
170
- {what user should observe}
171
- awaiting: user response
172
-
173
- ## Smoke Tests
174
- {results if ran, otherwise omitted}
175
-
176
- ## Tests
177
-
178
- ### 1. {Test Name}
179
- expected: {observable behavior}
180
- result: [pending]
181
-
182
- ### 2. {Test Name}
183
- expected: {observable behavior}
184
- result: [pending]
185
-
186
- ...
187
-
188
- ## Summary
189
-
190
- total: {N}
191
- passed: 0
192
- issues: 0
193
- pending: {N}
194
- skipped: 0
195
-
196
- ## Gaps
197
-
198
- [none yet]
199
- ```
200
-
201
- Proceed to Step 7.
202
-
203
- ---
204
-
205
- ### Step 7: Present Test
206
-
207
- Present current test to user (one at a time):
208
-
209
- Read Current Test section from uat.md.
210
-
211
- Display:
212
-
213
- ```
214
- ------------------------------------------------------------
215
- TEST {number}/{total}: {name}
216
- ------------------------------------------------------------
217
-
218
- Expected behavior:
219
- {expected}
220
-
221
- ------------------------------------------------------------
222
- > Type "pass" or describe what's wrong
223
- ------------------------------------------------------------
224
- ```
225
-
226
- Wait for user response (plain text, no AskUserQuestion).
227
-
228
- ---
229
-
230
- ### Step 8: Process Response
231
-
232
- **If response indicates pass:**
233
- - Empty response, "yes", "y", "ok", "pass", "next"
234
-
235
- **If response indicates skip:**
236
- - "skip", "can't test", "n/a"
237
-
238
- **If response is anything else (issue):**
239
- - Treat as issue description
240
- - Infer severity from description (see Severity Inference section)
241
-
242
- For issues, update Tests section:
243
- ```yaml
244
- ### {N}. {name}
245
- expected: {expected}
246
- result: issue
247
- reported: "{verbatim user response}"
248
- severity: {inferred}
249
- ```
250
-
251
- Append to Gaps section:
252
- ```yaml
253
- - test: {N}
254
- truth: "{expected behavior}"
255
- status: failed
256
- reason: "User reported: {verbatim}"
257
- severity: {inferred}
258
- requirement_ref: {if mapped}
259
- ```
260
-
261
- **Auto-create Issue from UAT Gap:**
262
-
263
- When result is "issue", create an issue in `.workflow/issues/issues.jsonl`:
264
- - **ID**: `ISS-{YYYYMMDD}-{NNN}` (auto-increment per day from existing entries)
265
- - **Fields**: `id`, `title` ("UAT: {test.name} - {response}" truncated 100 chars), `status: "registered"`, `priority` (from severity), `severity`, `source: "uat"`, `phase_ref` (if phase-scoped), `gap_ref: test.id`, `description` (expected vs reported), `fix_direction: ""`, `context` (with requirement_ref), `tags: ["uat"]`, `affected_components: []`, `feedback: []`, `issue_history: []`, timestamps, `resolved_at: null`, `resolution: null`
266
- - Back-reference: set `gap.issue_id = issue_id` in the gap YAML entry
267
-
268
- **Batched writes for efficiency:**
269
- Keep results in memory. Write to file only when:
270
- 1. **Issue found** -- Preserve the problem immediately
271
- 2. **Session complete** -- Final write before artifacts
272
- 3. **Checkpoint** -- Every 5 passed tests (safety net for context reset)
273
-
274
- If more tests remain -> Update Current Test, go to Step 7
275
- If no more tests -> Go to Step 10
276
-
277
- ---
278
-
279
- ### Step 9: Resume From File
280
-
281
- Read the full uat.md file.
282
- Find first test with `result: [pending]`.
283
-
284
- Announce progress and continue from pending test.
285
- Update Current Test section with the pending test.
286
- Proceed to Step 7.
287
-
288
- ---
289
-
290
- ### Step 10: Complete Session
291
-
292
- Update uat.md frontmatter: status -> "complete", updated timestamp.
293
-
294
- **Archive previous test result artifacts** before writing: if `test-results.json` or `coverage-report.json` exist in `$OUTPUT_DIR/.tests/`, move them to `$OUTPUT_DIR/.history/{name}-{YYYY-MM-DDTHH-mm-ss}.{ext}`.
295
-
296
- Write `.tests/test-results.json`:
297
- ```json
298
- {
299
- "target": "{phase or scratch ID}",
300
- "completed_at": "{ISO timestamp}",
301
- "results": [
302
- { "id": "T-001", "name": "...", "status": "pass|issue|skipped", "details": "..." }
303
- ],
304
- "summary": { "total": N, "passed": N, "issues": N, "skipped": N }
305
- }
306
- ```
307
-
308
- Write `.tests/coverage-report.json`:
309
- ```json
310
- {
311
- "target": "{phase or scratch ID}",
312
- "generated_at": "{ISO timestamp}",
313
- "requirements_covered": ["SC-001"],
314
- "requirements_uncovered": ["SC-003"],
315
- "coverage_percentage": 66.7
316
- }
317
- ```
318
-
319
- Update index.json with uat results:
320
- ```json
321
- {
322
- "uat": {
323
- "status": "passed|gaps_found",
324
- "test_count": N,
325
- "passed": N,
326
- "gaps": [...]
327
- }
328
- }
329
- ```
330
-
331
- If issues == 0 -> go to Step 13 (report, all pass).
332
- If issues > 0 -> go to Step 11.
333
-
334
- ---
335
-
336
- ### Step 11: Auto-Diagnose
337
-
338
- **Spawn parallel debug agents for gap clusters.**
339
-
340
- 1. **Cluster related gaps**: Group issues by affected component/area.
341
- - Same file/module -> one cluster
342
- - Same feature/flow -> one cluster
343
- - Unrelated -> separate clusters
344
-
345
- 2. **Spawn one debug agent per cluster** (parallel):
346
-
347
- For each cluster, spawn a general-purpose agent with pre-filled symptoms (test ID, expected, reported, severity). Agent investigates source files and returns per gap: `root_cause`, `fix_direction`, `affected_files`, `evidence` (file:line refs). Mode: `symptoms_prefilled`, goal: `find_root_cause`. `run_in_background: false`.
348
-
349
- 3. **Collect results** from all agents.
350
-
351
- **Pass issue_ids to debug context:** gather `issue_id` from each gap in the cluster and include in agent prompt so debug agents can reference/update corresponding issues.
352
-
353
- 4. **Update uat.md** gaps with diagnosis:
354
- ```yaml
355
- - test: {N}
356
- truth: "..."
357
- status: failed
358
- reason: "..."
359
- severity: {inferred}
360
- root_cause: "{diagnosed cause}"
361
- fix_direction: "{suggested approach}"
362
- affected_files: ["{file1}", "{file2}"]
363
- ```
364
-
365
- Proceed to Step 12.
366
-
367
- ---
368
-
369
- ### Step 12: Gap Closure Decision
370
-
371
- If AUTO_FIX is set:
372
- - Skip user prompt, go directly to gap-fix loop.
373
-
374
- If AUTO_FIX is not set:
375
- - Present diagnosis summary and offer options:
376
-
377
- ```
378
- ### Diagnosis Complete
379
-
380
- | Gap | Severity | Root Cause | Fix Direction |
381
- |-----|----------|------------|---------------|
382
- | T-3 | major | Missing null check | Add guard clause |
383
- | T-5 | blocker | Event not cleaned | Add cleanup logic |
384
-
385
- Options:
386
- 1. Auto-fix -- Plan and execute fixes, then re-verify
387
- 2. Debug deep -- Skill({ skill: "quality-debug" }) per issue
388
- 3. Plan fixes -- Skill({ skill: "maestro-plan", args: "{phase} --gaps" })
389
- 4. Manual fix -- Address issues yourself
390
- ```
391
-
392
- | Choice | Action |
393
- |--------|--------|
394
- | 1 / "auto-fix" | Go to gap-fix loop |
395
- | 2 / "debug" | Suggest Skill({ skill: "quality-debug" }) |
396
- | 3 / "plan" | Suggest Skill({ skill: "maestro-plan", args: "{phase} --gaps" }) |
397
- | 4 / "manual" | Done, report results |
398
-
399
- **Gap-fix closure loop:**
400
-
401
- Execute the loop: plan --gaps -> execute -> re-verify.
402
-
403
- 1. Run Skill({ skill: "maestro-plan", args: "{phase} --gaps" }) -- generates fix tasks from gaps
404
- 2. Run Skill({ skill: "maestro-execute", args: "{phase}" }) -- executes fix tasks
405
- 3. Run Skill({ skill: "maestro-execute", args: "{phase}" }) -- re-verify via verification gate
406
-
407
- If re-verify passes: update uat.md gaps as resolved, report success.
408
- If re-verify still has gaps: report remaining gaps, suggest manual intervention.
409
-
410
- **Issue lifecycle updates during gap-fix loop:**
411
- - Before plan --gaps: transition issues `registered` -> `planning`
412
- - Before execute: transition `planning` -> `executing`
413
- - After re-verify: resolved gaps -> `completed` (with resolution "auto-fixed via gap-fix loop"), unresolved -> `failed`
414
-
415
- **Loop limit**: Maximum 2 iterations to prevent infinite loops.
416
-
417
- ---
418
-
419
- ### Step 12.5: UAT Confidence Scoring
420
-
421
- Dimensions (4): scenario_coverage, diagnostic_depth, observation_quality, closure_completeness. Factors (weights): requirements_mapped(.30), observation_specificity(.25), user_validation(.20), diagnostic_depth(.15), consistency(.10). Score at: init (Step 5), per user response (Step 8), after gap-fix loop (Step 12).
422
-
423
- Quality mechanisms: Pressure Pass — >80% pass → ask user to try edge case. Devil's Advocate — >70% first-try pass → challenge scenario difficulty. Stall Detection — 2 gap-fix iterations without improvement → stop.
424
-
425
- Readiness Gate (blocks Step 13): scenario_coverage < 40% | blocker gap without diagnosis | no pressure pass (if >80%) | unresolved gaps without acknowledgment. Append confidence summary to uat.md.
426
-
427
- ---
428
-
429
- ### Step 13: Report
430
-
431
- ```
432
- === UAT RESULTS ===
433
- Target: {target}
434
-
435
- Smoke Tests: {smoke_count} run, {smoke_pass} passed (if ran)
436
- UAT Tests: {total} total
437
- Passed: {passed}
438
- Issues: {issues} ({blocker_count} blockers, {major_count} major)
439
- Skipped: {skipped}
440
-
441
- Diagnosis: {diagnosed_count}/{issues} gaps diagnosed
442
- Auto-fix: {fixed_count} gaps resolved (if ran)
443
-
444
- Files:
445
- {target_dir}/uat.md
446
- {target_dir}/.tests/test-results.json
447
- {target_dir}/.tests/coverage-report.json
448
-
449
- Next steps:
450
- {suggested_next_command}
451
- ```
452
-
453
- **Next step routing:**
454
-
455
- | Result | Suggestion |
456
- |--------|------------|
457
- | All passed, no gaps | Skill({ skill: "maestro-milestone-audit" }) |
458
- | Gaps auto-fixed | Skill({ skill: "maestro-milestone-audit" }) |
459
- | Gaps remain, diagnosed | Skill({ skill: "quality-debug" }) or Skill({ skill: "maestro-plan", args: "--gaps" }) |
460
- | Low coverage | Skill({ skill: "quality-auto-test", args: "{phase}" }) to generate missing tests |
461
-
462
- ---
463
-
464
- ## Severity Inference
465
-
466
- Infer severity from user's natural language:
467
-
468
- | User says | Infer |
469
- |-----------|-------|
470
- | "crashes", "error", "exception", "fails completely", "can't use" | blocker |
471
- | "doesn't work", "nothing happens", "wrong behavior", "broken" | major |
472
- | "works but...", "slow", "weird", "minor issue", "inconsistent" | minor |
473
- | "color", "spacing", "alignment", "looks off", "typo" | cosmetic |
474
-
475
- Default to **major** if unclear. Never ask "how severe is this?" -- just infer and move on.
1
+ # Test Workflow (UAT)
2
+
3
+ Conversational UAT testing with persistent state, auto-diagnosis, and gap-fix closure loop.
4
+
5
+ **Core**: Show expected behavior, ask if reality matches. One test at a time.
6
+ - "yes" / "y" / "next" / empty / "pass" → pass
7
+ - "skip" / "can't test" / "n/a" → skipped
8
+ - Anything else → logged as issue, severity auto-inferred
9
+
10
+ NEVER ask "how severe is this?"
11
+
12
+ ---
13
+
14
+ ### Step 1: Resolve Target
15
+
16
+ | Input | Action |
17
+ |-------|--------|
18
+ | Phase number (e.g., "3") | `TARGET_TYPE=phase`, resolve from `state.json` artifacts |
19
+ | Scratch task ID | `TARGET_TYPE=scratch`, `SCRATCH_DIR=.workflow/scratch/{id}/` |
20
+ | Nothing | Check active UAT sessions (Step 2), else prompt user |
21
+
22
+ **Flags:** `--smoke` (cold-start smoke tests before UAT), `--auto-fix` (auto gap-fix loop on failures)
23
+
24
+ Validate target exists and has verification.json (E002).
25
+
26
+ ---
27
+
28
+ ### Step 2: Check Active Sessions
29
+
30
+ ```bash
31
+ # Check scratch dirs (resolved via artifact registry) for active UAT sessions
32
+ find .workflow/scratch -name "uat.md" -type f 2>/dev/null | head -5
33
+ ```
34
+
35
+ Read each file's frontmatter (status, target) and Current Test section.
36
+
37
+ **If active sessions exist AND no $ARGUMENTS:**
38
+
39
+ Display inline:
40
+ ```
41
+ ## Active UAT Sessions
42
+
43
+ | # | Target | Status | Current Test | Progress |
44
+ |---|--------|--------|--------------|----------|
45
+ | 1 | 04-comments | testing | 3. Reply to Comment | 2/6 |
46
+ | 2 | quick-fix-nav | testing | 1. Nav Links | 0/4 |
47
+
48
+ Reply with a number to resume, or provide a phase/task to start new.
49
+ ```
50
+
51
+ Wait for user response.
52
+ - Number -> resume that session (go to Step 9: Resume From File)
53
+ - Phase/task ID -> new session (go to Step 4: Load Verification Context)
54
+
55
+ **If active sessions exist AND $ARGUMENTS provided:**
56
+ Check if session exists for that target. If yes, offer resume or restart.
57
+
58
+ **If no active sessions AND no $ARGUMENTS:**
59
+ Prompt: "No active UAT sessions. Provide a phase number or scratch task ID to start testing."
60
+
61
+ **If no active sessions AND $ARGUMENTS:**
62
+ Continue to Step 3 or Step 4.
63
+
64
+ ---
65
+
66
+ ### Step 3: Run Smoke Tests (if --smoke)
67
+
68
+ Skip if --smoke not set.
69
+
70
+ Inject basic sanity tests BEFORE UAT scenarios:
71
+
72
+ | Smoke Test | Check | Method |
73
+ |------------|-------|--------|
74
+ | App starts | Process runs without crash | `bash: start command, check exit code` |
75
+ | Routes respond | Key endpoints return non-error | `bash: curl/fetch main routes` |
76
+ | Build clean | No build errors | `bash: build command succeeds` |
77
+ | Dependencies | No missing deps | `bash: install check` |
78
+
79
+ Record smoke results in uat.md under `## Smoke Tests` section.
80
+ If any smoke test fails: abort UAT, report as blocker, suggest Skill({ skill: "quality-debug" }). (E003)
81
+
82
+ ---
83
+
84
+ ### Step 4: Load Verification Context
85
+
86
+ Read from target directory: `verification.json`, `validation.json`, `index.json`, `plan.json`, `.summaries/TASK-*.md`.
87
+
88
+ Build testable list from success_criteria + must_haves + task accomplishments (user-observable outcomes only).
89
+
90
+ ---
91
+
92
+ ### Step 5: Design Test Scenarios
93
+
94
+ For each testable item, create a scenario:
95
+ - **id**: T-001, T-002, ...
96
+ - **name**: Brief test name
97
+ - **category**: "e2e" | "integration" | "unit"
98
+ - **expected**: Specific observable behavior (what user should see)
99
+ - **requirement_ref**: Which success criterion this covers
100
+
101
+ Write test-plan.json to `.tests/`:
102
+ ```json
103
+ {
104
+ "target": "{phase or scratch ID}",
105
+ "generated_at": "{ISO timestamp}",
106
+ "tests": [...],
107
+ "coverage": {
108
+ "requirements_mapped": ["SC-001"],
109
+ "requirements_unmapped": ["SC-003"]
110
+ }
111
+ }
112
+ ```
113
+
114
+ ```bash
115
+ mkdir -p "$OUTPUT_DIR/.tests"
116
+ ```
117
+
118
+ Skip internal/non-observable items (refactors, type changes).
119
+
120
+ ---
121
+
122
+ ### Step 6: Create UAT File
123
+
124
+ Archive existing `uat.md` → `$OUTPUT_DIR/.history/uat-{YYYY-MM-DDTHH-mm-ss}.md`.
125
+
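As an illustration of the archive naming pattern in this step, here is a minimal sketch. The helper name `archive_name` is hypothetical, and it assumes the `{YYYY-MM-DDTHH-mm-ss}` placeholder means a local timestamp with hyphens in place of colons.

```python
from datetime import datetime


def archive_name(moment: datetime) -> str:
    """Build an archive file name such as uat-2025-01-05T09-03-07.md.

    Hyphens replace the colons of an ISO timestamp so the name is
    valid on file systems that reject ':' in file names.
    """
    return f"uat-{moment.strftime('%Y-%m-%dT%H-%M-%S')}.md"


if __name__ == "__main__":
    print(archive_name(datetime.now()))
```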
126
+ Create `$OUTPUT_DIR/uat.md`:
127
+
128
+ ```markdown
129
+ ---
130
+ status: testing
131
+ target: {phase slug or scratch ID}
132
+ source: [list of summary files]
133
+ started: {ISO timestamp}
134
+ updated: {ISO timestamp}
135
+ ---
136
+
137
+ ## Current Test
138
+ <!-- OVERWRITE each test - shows where we are -->
139
+
140
+ number: 1
141
+ name: {first test name}
142
+ expected: |
143
+ {what user should observe}
144
+ awaiting: user response
145
+
146
+ ## Smoke Tests
147
+ {results if ran, otherwise omitted}
148
+
149
+ ## Tests
150
+
151
+ ### 1. {Test Name}
152
+ expected: {observable behavior}
153
+ result: [pending]
154
+
155
+ ### 2. {Test Name}
156
+ expected: {observable behavior}
157
+ result: [pending]
158
+
159
+ ...
160
+
161
+ ## Summary
162
+
163
+ total: {N}
164
+ passed: 0
165
+ issues: 0
166
+ pending: {N}
167
+ skipped: 0
168
+
169
+ ## Gaps
170
+
171
+ [none yet]
172
+ ```
173
+
174
+ Continue to Step 7.
175
+
176
+ ---
177
+
178
+ ### Step 7: Present Test
179
+
180
+ Display:
181
+
182
+ ```
183
+ ------------------------------------------------------------
184
+ TEST {number}/{total}: {name}
185
+ ------------------------------------------------------------
186
+
187
+ Expected behavior:
188
+ {expected}
189
+
190
+ ------------------------------------------------------------
191
+ > Type "pass" or describe what's wrong
192
+ ------------------------------------------------------------
193
+ ```
194
+
195
+ Wait for user response (plain text).
196
+
197
+ ---
198
+
199
+ ### Step 8: Process Response
200
+
201
+ | Response | Action |
202
+ |----------|--------|
203
+ | empty / "yes" / "y" / "ok" / "pass" / "next" | Pass |
204
+ | "skip" / "can't test" / "n/a" | Skipped |
205
+ | Anything else | Issue (severity auto-inferred) |
206
+
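The response table can be read as a three-way classifier. The sketch below is one possible reading, not the skill's actual implementation: the function name `classify_response` is hypothetical, and trimming whitespace and ignoring case are assumptions the table does not state.

```python
# Responses that count as a pass (the empty string covers a bare Enter).
PASS_WORDS = {"", "yes", "y", "ok", "pass", "next"}
# Responses that mark the test as skipped.
SKIP_WORDS = {"skip", "can't test", "n/a"}


def classify_response(response: str) -> str:
    """Map a plain-text user reply to 'pass', 'skipped', or 'issue'."""
    normalized = response.strip().lower()
    if normalized in PASS_WORDS:
        return "pass"
    if normalized in SKIP_WORDS:
        return "skipped"
    # Anything else is treated as an issue report; severity is inferred later.
    return "issue"
```

The return values match the `pass|issue|skipped` status strings used in `test-results.json`.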
207
+ For issues, update Tests section:
208
+ ```yaml
209
+ ### {N}. {name}
210
+ expected: {expected}
211
+ result: issue
212
+ reported: "{verbatim user response}"
213
+ severity: {inferred}
214
+ ```
215
+
216
+ Append to Gaps section:
217
+ ```yaml
218
+ - test: {N}
219
+ truth: "{expected behavior}"
220
+ status: failed
221
+ reason: "User reported: {verbatim}"
222
+ severity: {inferred}
223
+ requirement_ref: {if mapped}
224
+ ```
225
+
226
+ **Auto-create Issue from UAT Gap:**
227
+
228
+ Append to `.workflow/issues/issues.jsonl`: `ISS-{YYYYMMDD}-{NNN}`, title "UAT: {test.name} - {response}" (max 100 chars), `source: "uat"`, severity/priority from inference. Back-reference: set `gap.issue_id` in gap YAML.
229
+
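A minimal sketch of building one such issue record follows. The helper `build_issue`, the field names `id`, `title`, and `severity`, and the reading of `{NNN}` as a zero-padded per-day sequence number are assumptions for illustration; only the ID pattern, the title format, the 100-character limit, and `source: "uat"` come from the text above.

```python
import json
from datetime import date


def build_issue(test_name: str, response: str, severity: str,
                day: date, sequence: int) -> dict:
    """Build an issue record for a failed UAT test (hypothetical schema)."""
    title = f"UAT: {test_name} - {response}"[:100]  # cap at 100 characters
    return {
        "id": f"ISS-{day:%Y%m%d}-{sequence:03d}",
        "title": title,
        "source": "uat",
        "severity": severity,
    }


def to_jsonl_line(issue: dict) -> str:
    """Serialize one record as a single JSONL line."""
    return json.dumps(issue, ensure_ascii=False) + "\n"
```

Appending `to_jsonl_line(...)` to `.workflow/issues/issues.jsonl` in append mode keeps earlier records untouched.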
230
+ **Write triggers:** 1) Issue found 2) Session complete 3) Every 5 passed tests (checkpoint).
231
+
232
+ More tests → Step 7. No more → Step 10.
233
+
234
+ ---
235
+
236
+ ### Step 9: Resume From File
237
+
238
+ Read uat.md → find first `result: [pending]` → update Current Test → Step 7.
239
+
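The resume lookup can be sketched as a single pass over the file. This assumes the `### {N}. {Test Name}` heading and `result: [pending]` layout from the Step 6 template; the function name `first_pending` is hypothetical.

```python
import re

# Matches test headings such as "### 3. Reply to Comment".
HEADING = re.compile(r"^### (\d+)\. (.+)$")


def first_pending(uat_text: str):
    """Return (number, name) of the first pending test, or None."""
    current = None
    for raw_line in uat_text.splitlines():
        line = raw_line.strip()
        match = HEADING.match(line)
        if match:
            current = (int(match.group(1)), match.group(2))
        elif line == "result: [pending]" and current is not None:
            return current
    return None
```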
240
+ ---
241
+
242
+ ### Step 10: Complete Session
243
+
244
+ Update uat.md: `status: complete`. Archive existing test artifacts → `.history/`.
245
+
246
+ Write `.tests/test-results.json`:
247
+ ```json
248
+ {
249
+ "target": "{phase or scratch ID}",
250
+ "completed_at": "{ISO timestamp}",
251
+ "results": [
252
+ { "id": "T-001", "name": "...", "status": "pass|issue|skipped", "details": "..." }
253
+ ],
254
+ "summary": { "total": N, "passed": N, "issues": N, "skipped": N }
255
+ }
256
+ ```
257
+
258
+ Write `.tests/coverage-report.json`:
259
+ ```json
260
+ {
261
+ "target": "{phase or scratch ID}",
262
+ "generated_at": "{ISO timestamp}",
263
+ "requirements_covered": ["SC-001", "SC-002"],
264
+ "requirements_uncovered": ["SC-003"],
265
+ "coverage_percentage": 66.7
266
+ }
267
+ ```
268
+
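For reference, one way the `coverage_percentage` field could be derived is sketched below. It assumes the value is covered requirements divided by all listed requirements, rounded to one decimal, and that an empty requirement set reports 0.0; the function name `coverage_report` is hypothetical.

```python
def coverage_report(target: str, generated_at: str,
                    covered: list, uncovered: list) -> dict:
    """Assemble a coverage report dict in the shape shown above."""
    total = len(covered) + len(uncovered)
    # Assumption: with no requirements at all, report 0.0 instead of dividing by zero.
    percentage = round(100 * len(covered) / total, 1) if total else 0.0
    return {
        "target": target,
        "generated_at": generated_at,
        "requirements_covered": covered,
        "requirements_uncovered": uncovered,
        "coverage_percentage": percentage,
    }
```

With two covered requirements and one uncovered, this yields 66.7.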
269
+ Update index.json with uat results (`status`, `test_count`, `passed`, `gaps`).
270
+
271
+ issues == 0 → Step 13. issues > 0 → Step 11.
272
+
273
+ ---
274
+
275
+ ### Step 11: Auto-Diagnose
276
+
277
+ 1. **Cluster gaps** by component/area (same file/module → one cluster, same flow → one cluster)
278
+ 2. **Spawn one debug agent per cluster** (parallel, `run_in_background: false`): pre-filled symptoms, `goal: find_root_cause`. Include `issue_id` refs.
279
+ 3. **Collect results**, update uat.md gaps:
280
+ ```yaml
281
+ - test: {N}
282
+ truth: "..."
283
+ status: failed
284
+ reason: "..."
285
+ severity: {inferred}
286
+ root_cause: "{diagnosed cause}"
287
+ fix_direction: "{suggested approach}"
288
+ affected_files: ["{file1}", "{file2}"]
289
+ ```
290
+
291
+ ---
292
+
293
+ ### Step 12: Gap Closure Decision
294
+
295
+ `AUTO_FIX` set → skip prompt, go to gap-fix loop. Otherwise present:
296
+
297
+ ```
298
+ ### Diagnosis Complete
299
+
300
+ | Gap | Severity | Root Cause | Fix Direction |
301
+ |-----|----------|------------|---------------|
302
+ | T-3 | major | Missing null check | Add guard clause |
303
+ | T-5 | blocker | Event not cleaned | Add cleanup logic |
304
+
305
+ Options:
306
+ 1. Auto-fix -- Plan and execute fixes, then re-verify
307
+ 2. Debug deep -- Skill({ skill: "quality-debug" }) per issue
308
+ 3. Plan fixes -- Skill({ skill: "maestro-plan", args: "{phase} --gaps" })
309
+ 4. Manual fix -- Address issues yourself
310
+ ```
311
+
312
+ | Choice | Action |
313
+ |--------|--------|
314
+ | 1 / "auto-fix" | Go to gap-fix loop |
315
+ | 2 / "debug" | Suggest Skill({ skill: "quality-debug" }) |
316
+ | 3 / "plan" | Suggest Skill({ skill: "maestro-plan", args: "{phase} --gaps" }) |
317
+ | 4 / "manual" | Done, report results |
318
+
319
+ **Gap-fix closure loop** (max 2 iterations):
320
+
321
+ 1. `maestro-plan {phase} --gaps` → fix tasks
322
+ 2. `maestro-execute {phase}` → execute fixes
323
+ 3. `maestro-verify {phase}` → re-verify
324
+
325
+ Issue lifecycle: `registered` → `planning` → `executing` → `completed` | `failed`.
326
+
327
+ Pass → update uat.md gaps as resolved. Still gaps → report remaining, suggest manual intervention.
328
+
329
+ ---
330
+
331
+ ### Step 12.5: UAT Confidence Scoring
332
+
333
+ Dimensions (4): scenario_coverage, diagnostic_depth, observation_quality, closure_completeness. Factors (weights): requirements_mapped(.30), observation_specificity(.25), user_validation(.20), diagnostic_depth(.15), consistency(.10). Score at: init (Step 5), per user response (Step 8), after gap-fix loop (Step 12).
334
+
335
+ Quality mechanisms: Pressure Pass — >80% pass → ask user to try edge case. Devil's Advocate — >70% first-try pass → challenge scenario difficulty. Stall Detection — 2 gap-fix iterations without improvement → stop.
336
+
337
+ Readiness Gate (blocks Step 13): scenario_coverage < 40% | blocker gap without diagnosis | no pressure pass (if >80%) | unresolved gaps without acknowledgment. Append confidence summary to uat.md.
338
+
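The factor weights sum to 1.00, which suggests a weighted average. The sketch below assumes each factor is scored in the range 0 to 1 and combined as a plain weighted sum; the text does not spell out the formula, so treat both the scale and the function name `confidence_score` as assumptions.

```python
# Factor weights as listed in Step 12.5.
WEIGHTS = {
    "requirements_mapped": 0.30,
    "observation_specificity": 0.25,
    "user_validation": 0.20,
    "diagnostic_depth": 0.15,
    "consistency": 0.10,
}


def confidence_score(factors: dict) -> float:
    """Combine per-factor scores (assumed 0..1) into one weighted score."""
    missing = WEIGHTS.keys() - factors.keys()
    if missing:
        raise ValueError(f"missing factors: {sorted(missing)}")
    return sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)
```

For example, scores of 1.0, 0.8, 0.5, 0.0, and 1.0 in the order listed give 0.30 + 0.20 + 0.10 + 0.00 + 0.10 = 0.70.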
339
+ ---
340
+
341
+ ### Step 13: Report
342
+
343
+ ```
344
+ === UAT RESULTS ===
345
+ Target: {target}
346
+
347
+ Smoke Tests: {smoke_count} run, {smoke_pass} passed (if ran)
348
+ UAT Tests: {total} total
349
+ Passed: {passed}
350
+ Issues: {issues} ({blocker_count} blockers, {major_count} major)
351
+ Skipped: {skipped}
352
+
353
+ Diagnosis: {diagnosed_count}/{issues} gaps diagnosed
354
+ Auto-fix: {fixed_count} gaps resolved (if ran)
355
+
356
+ Files:
357
+ {target_dir}/uat.md
358
+ {target_dir}/.tests/test-results.json
359
+ {target_dir}/.tests/coverage-report.json
360
+
361
+ Next steps:
362
+ {suggested_next_command}
363
+ ```
364
+
365
+ **Next step routing:**
366
+
367
+ | Result | Suggestion |
368
+ |--------|------------|
369
+ | All passed, no gaps | Skill({ skill: "maestro-milestone-audit" }) |
370
+ | Gaps auto-fixed | Skill({ skill: "maestro-milestone-audit" }) |
371
+ | Gaps remain, diagnosed | Skill({ skill: "quality-debug" }) or Skill({ skill: "maestro-plan", args: "--gaps" }) |
372
+ | Low coverage | Skill({ skill: "quality-auto-test", args: "{phase}" }) to generate missing tests |
373
+
374
+ ---
375
+
376
+ ## Severity Inference
377
+
378
+ | User says | Infer |
379
+ |-----------|-------|
380
+ | "crashes", "error", "exception", "fails completely", "can't use" | blocker |
381
+ | "doesn't work", "nothing happens", "wrong behavior", "broken" | major |
382
+ | "works but...", "slow", "weird", "minor issue", "inconsistent" | minor |
383
+ | "color", "spacing", "alignment", "looks off", "typo" | cosmetic |
384
+
385
+ Default: **major**. NEVER ask severity — infer and move on.
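The inference table can be approximated with keyword matching. This is a rough sketch under stated assumptions: case-insensitive substring matching and checking rows in table order (so the most severe match wins) are both choices made here, and `infer_severity` is a hypothetical name. A real implementation would likely rely on the model's judgment, not a fixed keyword list.

```python
# Rows from the severity table, ordered from most to least severe.
SEVERITY_KEYWORDS = [
    ("blocker", ["crashes", "error", "exception", "fails completely", "can't use"]),
    ("major", ["doesn't work", "nothing happens", "wrong behavior", "broken"]),
    ("minor", ["works but", "slow", "weird", "minor issue", "inconsistent"]),
    ("cosmetic", ["color", "spacing", "alignment", "looks off", "typo"]),
]


def infer_severity(report: str) -> str:
    """Infer severity from a verbatim user report; default to 'major'."""
    text = report.lower()
    for severity, keywords in SEVERITY_KEYWORDS:
        if any(keyword in text for keyword in keywords):
            return severity
    return "major"  # default when nothing matches
```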