maestro-flow 0.4.6 → 0.4.8

Files changed (105)
  1. package/.claude/commands/maestro-ralph.md +548 -377
  2. package/.claude/commands/maestro.md +220 -191
  3. package/.codex/skills/maestro/SKILL.md +495 -462
  4. package/.codex/skills/maestro-collab/SKILL.md +218 -117
  5. package/.codex/skills/maestro-execute/SKILL.md +13 -11
  6. package/.codex/skills/maestro-milestone-audit/SKILL.md +12 -10
  7. package/.codex/skills/maestro-ralph/SKILL.md +491 -339
  8. package/.codex/skills/maestro-ui-codify/SKILL.md +18 -16
  9. package/.codex/skills/manage-codebase-rebuild/SKILL.md +20 -13
  10. package/.codex/skills/manage-issue-discover/SKILL.md +19 -17
  11. package/.codex/skills/quality-debug/SKILL.md +35 -31
  12. package/.codex/skills/quality-refactor/SKILL.md +20 -12
  13. package/.codex/skills/quality-review/SKILL.md +21 -17
  14. package/.codex/skills/team-coordinate/SKILL.md +462 -235
  15. package/.codex/skills/team-coordinate/specs/role-catalog.md +132 -0
  16. package/.codex/skills/team-lifecycle-v4/SKILL.md +445 -191
  17. package/.codex/skills/team-quality-assurance/SKILL.md +205 -161
  18. package/.codex/skills/team-review/SKILL.md +198 -159
  19. package/.codex/skills/team-tech-debt/SKILL.md +214 -144
  20. package/.codex/skills/team-testing/SKILL.md +210 -158
  21. package/dashboard/dist-server/dashboard/src/server/agents/claude-code-adapter.js +25 -33
  22. package/dashboard/dist-server/dashboard/src/server/agents/claude-code-adapter.js.map +1 -1
  23. package/dashboard/dist-server/dashboard/src/server/agents/claude-code-adapter.test.js +9 -3
  24. package/dashboard/dist-server/dashboard/src/server/agents/claude-code-adapter.test.js.map +1 -1
  25. package/dashboard/dist-server/dashboard/src/server/agents/codex-app-server-adapter.js +5 -2
  26. package/dashboard/dist-server/dashboard/src/server/agents/codex-app-server-adapter.js.map +1 -1
  27. package/dashboard/dist-server/dashboard/src/server/agents/codex-cli-adapter.js +20 -8
  28. package/dashboard/dist-server/dashboard/src/server/agents/codex-cli-adapter.js.map +1 -1
  29. package/dashboard/dist-server/dashboard/src/server/agents/gemini-a2a-adapter.js +6 -3
  30. package/dashboard/dist-server/dashboard/src/server/agents/gemini-a2a-adapter.js.map +1 -1
  31. package/dashboard/dist-server/dashboard/src/server/agents/gemini-a2a-adapter.test.js +7 -1
  32. package/dashboard/dist-server/dashboard/src/server/agents/gemini-a2a-adapter.test.js.map +1 -1
  33. package/dashboard/dist-server/dashboard/src/server/agents/opencode-adapter.d.ts +2 -0
  34. package/dashboard/dist-server/dashboard/src/server/agents/opencode-adapter.js +40 -15
  35. package/dashboard/dist-server/dashboard/src/server/agents/opencode-adapter.js.map +1 -1
  36. package/dashboard/dist-server/dashboard/src/server/agents/process-tree-kill.d.ts +1 -0
  37. package/dashboard/dist-server/dashboard/src/server/agents/process-tree-kill.js +59 -0
  38. package/dashboard/dist-server/dashboard/src/server/agents/process-tree-kill.js.map +1 -0
  39. package/dashboard/dist-server/dashboard/src/server/agents/process-tree-kill.test.d.ts +1 -0
  40. package/dashboard/dist-server/dashboard/src/server/agents/process-tree-kill.test.js +78 -0
  41. package/dashboard/dist-server/dashboard/src/server/agents/process-tree-kill.test.js.map +1 -0
  42. package/dashboard/dist-server/dashboard/src/server/agents/stale-handler.d.ts +25 -0
  43. package/dashboard/dist-server/dashboard/src/server/agents/stale-handler.js +40 -0
  44. package/dashboard/dist-server/dashboard/src/server/agents/stale-handler.js.map +1 -0
  45. package/dashboard/dist-server/dashboard/src/server/agents/stale-handler.test.d.ts +1 -0
  46. package/dashboard/dist-server/dashboard/src/server/agents/stale-handler.test.js +89 -0
  47. package/dashboard/dist-server/dashboard/src/server/agents/stale-handler.test.js.map +1 -0
  48. package/dashboard/dist-server/dashboard/src/server/agents/stream-json-adapter.js +19 -8
  49. package/dashboard/dist-server/dashboard/src/server/agents/stream-json-adapter.js.map +1 -1
  50. package/dashboard/dist-server/dashboard/src/server/agents/stream-monitor.d.ts +6 -0
  51. package/dashboard/dist-server/dashboard/src/server/agents/stream-monitor.js +7 -1
  52. package/dashboard/dist-server/dashboard/src/server/agents/stream-monitor.js.map +1 -1
  53. package/dashboard/dist-server/dashboard/src/server/agents/stream-monitor.test.d.ts +1 -0
  54. package/dashboard/dist-server/dashboard/src/server/agents/stream-monitor.test.js +46 -0
  55. package/dashboard/dist-server/dashboard/src/server/agents/stream-monitor.test.js.map +1 -0
  56. package/dashboard/dist-server/shared/agent-types.d.ts +6 -0
  57. package/dist/shared/agent-types.d.ts +6 -0
  58. package/dist/shared/agent-types.d.ts.map +1 -1
  59. package/dist/src/agents/cli-agent-runner.d.ts +3 -0
  60. package/dist/src/agents/cli-agent-runner.d.ts.map +1 -1
  61. package/dist/src/agents/cli-agent-runner.js +1 -0
  62. package/dist/src/agents/cli-agent-runner.js.map +1 -1
  63. package/dist/src/commands/delegate.d.ts +2 -0
  64. package/dist/src/commands/delegate.d.ts.map +1 -1
  65. package/dist/src/commands/delegate.js +18 -0
  66. package/dist/src/commands/delegate.js.map +1 -1
  67. package/dist/src/config/cli-tools-config.d.ts +3 -0
  68. package/dist/src/config/cli-tools-config.d.ts.map +1 -1
  69. package/dist/src/config/cli-tools-config.js.map +1 -1
  70. package/package.json +1 -1
  71. package/shared/agent-types.ts +237 -231
  72. package/.codex/skills/team-coordinate/roles/coordinator/commands/analyze-task.md +0 -247
  73. package/.codex/skills/team-coordinate/roles/coordinator/commands/dispatch.md +0 -126
  74. package/.codex/skills/team-coordinate/roles/coordinator/commands/monitor.md +0 -265
  75. package/.codex/skills/team-coordinate/roles/coordinator/role.md +0 -403
  76. package/.codex/skills/team-coordinate/specs/knowledge-transfer.md +0 -113
  77. package/.codex/skills/team-coordinate/specs/pipelines.md +0 -97
  78. package/.codex/skills/team-coordinate/specs/quality-gates.md +0 -112
  79. package/.codex/skills/team-coordinate/specs/role-spec-template.md +0 -192
  80. package/.codex/skills/team-executor/SKILL.md +0 -116
  81. package/.codex/skills/team-executor/roles/executor/commands/monitor.md +0 -213
  82. package/.codex/skills/team-executor/roles/executor/role.md +0 -173
  83. package/.codex/skills/team-executor/specs/session-schema.md +0 -230
  84. package/.codex/skills/team-lifecycle-v4/roles/coordinator/commands/analyze.md +0 -56
  85. package/.codex/skills/team-lifecycle-v4/roles/coordinator/commands/dispatch.md +0 -61
  86. package/.codex/skills/team-lifecycle-v4/roles/coordinator/commands/monitor.md +0 -113
  87. package/.codex/skills/team-lifecycle-v4/roles/coordinator/role.md +0 -189
  88. package/.codex/skills/team-lifecycle-v4/schemas/tasks-schema.md +0 -100
  89. package/.codex/skills/team-lifecycle-v4/specs/knowledge-transfer.md +0 -204
  90. package/.codex/skills/team-quality-assurance/roles/coordinator/commands/analyze.md +0 -72
  91. package/.codex/skills/team-quality-assurance/roles/coordinator/commands/dispatch.md +0 -108
  92. package/.codex/skills/team-quality-assurance/roles/coordinator/commands/monitor.md +0 -163
  93. package/.codex/skills/team-quality-assurance/roles/coordinator/role.md +0 -177
  94. package/.codex/skills/team-review/roles/coordinator/commands/analyze.md +0 -71
  95. package/.codex/skills/team-review/roles/coordinator/commands/dispatch.md +0 -90
  96. package/.codex/skills/team-review/roles/coordinator/commands/monitor.md +0 -135
  97. package/.codex/skills/team-review/roles/coordinator/role.md +0 -176
  98. package/.codex/skills/team-tech-debt/roles/coordinator/commands/analyze.md +0 -47
  99. package/.codex/skills/team-tech-debt/roles/coordinator/commands/dispatch.md +0 -163
  100. package/.codex/skills/team-tech-debt/roles/coordinator/commands/monitor.md +0 -133
  101. package/.codex/skills/team-tech-debt/roles/coordinator/role.md +0 -173
  102. package/.codex/skills/team-testing/roles/coordinator/commands/analyze.md +0 -70
  103. package/.codex/skills/team-testing/roles/coordinator/commands/dispatch.md +0 -106
  104. package/.codex/skills/team-testing/roles/coordinator/commands/monitor.md +0 -156
  105. package/.codex/skills/team-testing/roles/coordinator/role.md +0 -185
@@ -1,339 +1,491 @@
1
- ---
2
- name: maestro-ralph
3
- description: Use when the optimal command sequence is unclear and needs automated state-based determination
4
- argument-hint: "\"intent\" [-y] | status | continue | execute"
5
- allowed-tools: spawn_agents_on_csv, Read, Write, Edit, Bash, Glob, Grep, request_user_input
6
- ---
7
-
8
- <purpose>
9
- Closed-loop decision engine for the maestro workflow lifecycle.
10
- Coordinator assembles fully-resolved skill calls -> spawns via `spawn_agents_on_csv` ->
11
- delegates evaluation at decision nodes -> dynamically expands/shrinks chain.
12
-
13
- Entry: `"intent"` (new session), `execute`/`continue` (resume), `status` (display).
14
- Two node types: **external** (spawn_agents_on_csv) and **decision** (delegate evaluate).
15
- Session at `.workflow/.maestro/ralph-{YYYYMMDD-HHmmss}/status.json`.
16
- </purpose>
17
-
18
- <context>
19
- $ARGUMENTS -- intent text, flags, or keywords.
20
-
21
- **Parse**: `-y` -> auto_mode. `.md/.txt` path -> input_doc (supplementary, never substitutes lifecycle). Remaining -> intent.
22
-
23
- **`-y` downstream propagation**:
24
-
25
- | Skill | Flag | Effect |
26
- |-------|------|--------|
27
- | maestro-init | `-y` | skip interactive |
28
- | maestro-analyze | `-y` | skip scoping |
29
- | maestro-brainstorm | `-y` | skip questions |
30
- | maestro-roadmap | `-y` | skip choices |
31
- | maestro-plan | `-y` | skip confirmation |
32
- | maestro-execute | `-y` | skip confirmation, auto-continue blocked |
33
- | quality-auto-test | `-y` | skip plan confirmation |
34
- | quality-test | `-y --auto-fix` | auto gap-fix loop |
35
- | maestro-verify | `-y` | skip confirmation |
36
- | quality-review | `-y` | skip confirmation |
37
- | quality-debug | `-y` | skip confirmation |
38
- | maestro-milestone-complete | `-y` | skip knowledge promotion |
39
- | maestro-milestone-audit | `-y` | skip confirmation |
40
-
41
- **State files**: `.workflow/state.json`, `.workflow/roadmap.md`, `.workflow/.maestro/ralph-*/status.json`
42
- </context>
43
-
44
- <invariants>
45
- 1. **ALL external steps via spawn_agents_on_csv** -- coordinator NEVER executes skill logic directly
46
- 2. **Coordinator = prompt assembler** -- classify -> enrich args -> build CSV -> spawn -> read results -> assemble next
47
- 3. **Decision nodes delegate-evaluate** -- `maestro delegate --role analyze`; structural decisions evaluated directly
48
- 4. **Barrier = solo wave** -- analyze, plan, execute, brainstorm, roadmap always run alone
49
- 5. **Non-barriers can parallel** -- consecutive non-barrier, non-decision external steps grouped
50
- 6. **Wave-by-wave** -- never start wave N+1 before wave N results read
51
- 7. **Coordinator owns context** -- sub-agents never read prior results; coordinator assembles full skill_call
52
- 8. **Quality mode governs steps** -- full/standard/quick determines quality stages
53
- 9. **passed_gates skip** -- already-passed gates not re-run (unless code changed)
54
- </invariants>
55
-
56
- <state_machine>
57
-
58
- <states>
59
- S_PARSE_ROUTE -- Parse arguments, route to entry point PERSIST: --
60
- S_STATUS -- Show session progress, then end PERSIST: --
61
- S_INFER -- Infer lifecycle position PERSIST: session.lifecycle_position
62
- S_RESOLVE_PHASE -- Resolve target phase PERSIST: session.phase
63
- S_QUALITY_MODE -- Determine quality mode (full/standard/quick) PERSIST: session.quality_mode
64
- S_BUILD_CHAIN -- Build step chain PERSIST: session.steps[]
65
- S_CREATE_SESSION -- Write status.json PERSIST: session (full)
66
- S_CONFIRM -- User confirmation (skipped in auto_mode) PERSIST: --
67
- S_LOAD_NEXT -- Find next pending step PERSIST: --
68
- S_WAVE_EXEC -- Build and execute wave PERSIST: session.waves[], context
69
- S_DECISION_EVAL -- Evaluate quality gate PERSIST: --
70
- S_APPLY_VERDICT -- Apply verdict PERSIST: passed_gates[], retry_count
71
- S_FIX_LOOP -- Insert fix steps, reindex PERSIST: session.steps[] (expanded)
72
- S_COMPLETE -- Mark completed PERSIST: session.status = "completed"
73
- S_PAUSED -- Pause, await manual intervention PERSIST: session.status = "paused"
74
- S_FALLBACK -- Request user input PERSIST: session.status = "paused"
75
- </states>
76
-
77
- <transitions>
78
-
79
- S_PARSE_ROUTE:
80
- -> S_STATUS WHEN: intent == "status"
81
- -> S_LOAD_NEXT WHEN: intent == "execute" | "continue"
82
- -> S_DECISION_EVAL WHEN: running session with decision step in "running"
83
- -> S_INFER WHEN: intent non-empty
84
- -> S_FALLBACK WHEN: no intent AND no running session
85
-
86
- S_STATUS -> END DO: A_SHOW_STATUS
87
-
88
- S_INFER:
89
- -> S_RESOLVE_PHASE WHEN: position resolved DO: A_INFER_POSITION
90
- -> S_FALLBACK WHEN: cannot infer
91
-
92
- S_RESOLVE_PHASE:
93
- -> S_QUALITY_MODE DO: A_RESOLVE_PHASE
94
-
95
- S_QUALITY_MODE:
96
- -> S_BUILD_CHAIN DO: A_DETERMINE_QUALITY_MODE
97
-
98
- S_BUILD_CHAIN:
99
- -> S_CREATE_SESSION DO: A_BUILD_STEPS
100
-
101
- S_CREATE_SESSION:
102
- -> S_CONFIRM WHEN: not auto_mode DO: A_CREATE_SESSION
103
- -> S_LOAD_NEXT WHEN: auto_mode DO: A_CREATE_SESSION
104
-
105
- S_CONFIRM:
106
- -> S_LOAD_NEXT WHEN: "Proceed"
107
- -> S_BUILD_CHAIN WHEN: "Edit"
108
- -> S_QUALITY_MODE WHEN: "Change quality mode"
109
- -> S_PAUSED WHEN: "Cancel"
110
-
111
- S_LOAD_NEXT:
112
- -> S_DECISION_EVAL WHEN: next_step.type == "decision"
113
- -> S_WAVE_EXEC WHEN: next_step.type == "external"
114
- -> S_COMPLETE WHEN: no pending steps
115
-
116
- S_WAVE_EXEC:
117
- -> S_LOAD_NEXT WHEN: success DO: A_BUILD_AND_SPAWN_WAVE
118
- -> S_PAUSED WHEN: failed GUARD: auto_mode retry once then pause
119
-
120
- S_DECISION_EVAL:
121
- -> S_APPLY_VERDICT WHEN: quality-gate DO: A_DELEGATE_EVALUATE
122
- -> S_APPLY_VERDICT WHEN: structural DO: A_STRUCTURAL_EVALUATE
123
-
124
- S_APPLY_VERDICT:
125
- -> S_LOAD_NEXT WHEN: proceed DO: add gate to passed_gates
126
- -> S_FIX_LOOP WHEN: fix DO: clear passed_gates, increment retry
127
- -> S_PAUSED WHEN: escalate
128
- -> S_LOAD_NEXT WHEN: post-milestone + next milestone DO: A_ADVANCE_MILESTONE
129
- -> S_COMPLETE WHEN: post-milestone + no next
130
- -> S_PAUSED WHEN: post-debug-escalate (always, even -y)
131
- GUARD: retry >= max_retries -> force escalate
132
- GUARD: confidence < 60 AND proceed -> override to fix
133
- GUARD: confidence > 95 AND fix AND retry > 0 -> suggest proceed
134
-
135
- S_FIX_LOOP:
136
- -> S_LOAD_NEXT DO: A_INSERT_FIX_LOOP
137
-
138
- S_COMPLETE -> END DO: A_FINALIZE
139
- S_PAUSED -> END DO: A_PAUSE_SESSION
140
- S_FALLBACK -> S_PARSE_ROUTE WHEN: user input | -> END WHEN: cancel
141
-
142
- </transitions>
143
-
144
- <actions>
145
-
146
- ### A_INFER_POSITION
147
-
148
- **Intent-based override**: brainstorm pattern -> position = brainstorm.
149
-
150
- **Bootstrap detection**:
151
-
152
- | Condition | Position |
153
- |-----------|----------|
154
- | No .workflow/ + no source | brainstorm |
155
- | No .workflow/ + has source | init |
156
- | Has .workflow/ but no state.json | init |
157
- | Has state.json | artifact-based inference |
158
-
159
- **Artifact-based inference:** Filter by current_milestone + target phase:
160
-
161
- | Condition | Position |
162
- |-----------|----------|
163
- | no milestones defined or no roadmap.md | `roadmap` |
164
- | no artifacts for target phase | `analyze` |
165
- | latest artifact = analyze | `plan` |
166
- | latest artifact = plan | `execute` |
167
- | latest artifact = execute | `verify` |
168
- | latest artifact = verify | → refine from result files |
169
-
170
- **Refine from verify results:**
171
-
172
- | Condition | Position |
173
- |-----------|----------|
174
- | verification.json: passed==false or gaps[] non-empty | `verify-failed` |
175
- | passed==true, no review.json, has auto-test report | `review` |
176
- | passed==true, no review.json, no auto-test report | `business-test` (full) / `review` (standard/quick) |
177
- | review.json: verdict=="BLOCK" | `review-failed` |
178
- | review.json: verdict!="BLOCK" | `test` |
179
- | uat.md: all passed | `milestone-audit` |
180
- | uat.md: has failures | `test-failed` |
181
-
182
- ### A_RESOLVE_PHASE
183
-
184
- Priority: regex from intent `phase\s*(\d+)` -> latest in-progress artifact's phase -> first incomplete phase -> null (brainstorm/init/roadmap) -> request_user_input if ambiguous.
185
-
186
- ### A_DETERMINE_QUALITY_MODE
187
-
188
- | Condition | Mode | Pipeline |
189
- |-----------|------|----------|
190
- | Has REQ-*.md + phase scope | full | verify -> business-test -> review -> test-gen -> test |
191
- | Default | standard | verify -> review -> test (test-gen if coverage < 80%) |
192
- | User --quality quick | quick | verify -> review --tier quick |
193
-
194
- ### A_BUILD_STEPS
195
-
196
- **Lifecycle stages:**
197
-
198
- | Stage | Skill | Barrier | Quality Mode | Decision after |
199
- |-------|-------|---------|-------------|----------------|
200
- | brainstorm | maestro-brainstorm "{intent}" | yes | all | — |
201
- | init | maestro-init | no | all | |
202
- | roadmap | maestro-roadmap "{intent}" | yes | all | — |
203
- | analyze | maestro-analyze {phase} | yes | all | — |
204
- | plan | maestro-plan {phase} | yes | all | — |
205
- | execute | maestro-execute {phase} | yes | all | — |
206
- | verify | maestro-verify {phase} | no | all | post-verify |
207
- | business-test | quality-auto-test {phase} | no | full only | post-business-test |
208
- | review | quality-review {phase} | no | all (quick: --tier quick) | post-review |
209
- | test-gen | quality-auto-test {phase} | no | full; standard if coverage<80% | |
210
- | test | quality-test {phase} | no | full, standard | post-test |
211
- | milestone-audit | maestro-milestone-audit | no | all | — |
212
- | milestone-complete | maestro-milestone-complete | no | all | post-milestone |
213
-
214
- **Build rules:**
215
- 1. Start from `lifecycle_position`, end at `milestone-complete`
216
- 2. Skip stages with existing completed artifacts (check state.json)
217
- 3. Filter stages by `quality_mode` -- skip non-applicable stages (see Quality Mode column)
218
- 4. Quick mode: `review` appends `--tier quick`; skips `business-test`, `test-gen`, `test`
219
- 5. Insert decision node after each stage with non-empty Decision column: `{ type: "decision", decision: "<gate>", retry_count: 0, max_retries: 2 }`
220
- 6. Args use placeholders `{phase}`, `{intent}`, `{dirs}` resolved at wave execution time
221
- 7. Append `-y` to all skill args when `auto_mode` is true (see -y propagation table in context)
222
-
223
- ### A_CREATE_SESSION
224
-
225
- 1. Write `.workflow/.maestro/ralph-{YYYYMMDD-HHmmss}/status.json` (see Session JSON Schema)
226
- 2. Initialize tracking:
227
- - `create_goal({ objective: "Ralph lifecycle: {quality_mode} mode, {N} steps from {lifecycle_position}" })`
228
- - `update_plan({ plan: steps.map(step => ({ step, status: "pending" })) })`
229
-
230
- ### A_BUILD_AND_SPAWN_WAVE
231
-
232
- 1. Conditional step eval: check_coverage -> read validation.json, skip if >= threshold
233
- 2. buildNextWave: barrier -> solo; non-barrier -> batch consecutive; stop at decision
234
- 3. buildSkillCall: resolve {phase}/{intent}/{dirs} placeholders, enrich (plan -> --dir analyze, execute -> --dir plan), append -y if auto_mode
235
- 4. Write wave-{N}.csv (id, skill_call, topic) -> `spawn_agents_on_csv`
236
- 5. Read results -> update step statuses
237
- 6. Barrier context update: analyze->context.analysis_dir, plan->context.plan_dir, execute->context.exec_status, brainstorm->context.brainstorm_dir, roadmap->context.spec_session_id
238
- 7. Persist status.json
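
The wave-grouping rule in step 2 of this removed (0.4.6) action can be sketched as follows. This is only an illustration under stated assumptions: `ChainStep` and the body of `buildNextWave` are hypothetical and not taken from the package; only the rule itself (barrier runs solo, consecutive non-barriers batch, stop at a decision) comes from the text.

```typescript
// Hypothetical step shape; field names loosely follow the Session JSON Schema.
interface ChainStep {
  index: number;
  type: "external" | "decision";
  barrier: boolean;
  status: "pending" | "completed";
}

// Collect the next wave, starting at the first pending step.
function buildNextWave(steps: ChainStep[]): ChainStep[] {
  const start = steps.findIndex((s) => s.status === "pending");
  if (start === -1) return [];
  const wave: ChainStep[] = [];
  for (const step of steps.slice(start)) {
    // Decision nodes never join a wave, so they end the batch.
    if (step.status !== "pending" || step.type === "decision") break;
    if (step.barrier) {
      // A barrier runs alone: take it only if the wave is still empty.
      if (wave.length === 0) wave.push(step);
      break;
    }
    wave.push(step);
  }
  return wave;
}
```

With this shape, a pending barrier such as `execute` yields a single-step wave, while two consecutive non-barrier steps followed by a decision yield a two-step wave.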
239
-
240
- ### A_DELEGATE_EVALUATE
241
-
242
- 1. Resolve result files per decision type (post-verify: verification.json, post-business-test: report.json, post-review: review.json, post-test: uat.md + test-results.json)
243
- 2. Execute `maestro delegate` with analysis prompt -> parse verdict: STATUS (proceed/fix/escalate), REASON, GAP_SUMMARY, CONFIDENCE_SCORE, WEAKEST_DIMENSION
244
- 3. Confidence adjustment: score < 60 + proceed -> fix; score > 95 + fix + retry > 0 -> suggest proceed
245
-
246
- ### A_STRUCTURAL_EVALUATE
247
-
248
- **post-milestone**: Read state.json -> next milestone -> update session (milestone, phase, reset gates), re-infer quality_mode, insert lifecycle steps. No next -> complete.
249
- **post-debug-escalate**: Pause (always, even -y). Display: max retries reached, manual intervention needed.
250
-
251
- ### A_INSERT_FIX_LOOP
252
-
253
- Insert fix template by decision type after current position, reindex:
254
- - **post-verify**: debug -> plan --gaps -> execute -> verify -> decision:post-verify
255
- - **post-business-test**: debug --from-business-test -> plan --gaps -> execute -> verify -> decision:post-verify -> auto-test -> decision:post-business-test
256
- - **post-review**: debug -> plan --gaps -> execute -> verify -> decision:post-verify -> review -> decision:post-review
257
- - **post-test**: debug --from-uat -> plan --gaps -> execute -> verify -> decision:post-verify -> [auto-test + decision:post-business-test (full)] -> review -> decision:post-review -> [auto-test (full; standard if <80%)] -> test -> decision:post-test
258
-
259
- ### A_ADVANCE_MILESTONE
260
-
261
- Update session: milestone, phase, reset passed_gates. Re-infer quality_mode. Build + insert new lifecycle steps for next milestone.
262
-
263
- ### A_FINALIZE
264
-
265
- 1. Set `session.status = "completed"`, write status.json
266
- 2. Sync update_plan: all steps "completed"
267
- 3. `update_goal({ status: "complete" })` -- release goal constraint
268
- 4. Display completion report
269
-
270
- ### A_PAUSE_SESSION
271
-
272
- 1. Set `session.status = "paused"`, write status.json
273
- 2. Do NOT call `update_goal` -- goal stays for `execute`/`continue` resume
274
- 3. Display: use `$maestro-ralph execute` to continue
275
-
276
- </actions>
277
-
278
- </state_machine>
279
-
280
- <appendix>
281
-
282
- ### Session JSON Schema
283
-
284
- ```json
285
- {
286
- "session_id": "ralph-{YYYYMMDD-HHmmss}",
287
- "source": "ralph", "intent": "", "status": "running|paused|completed",
288
- "lifecycle_position": "", "phase": null, "milestone": null,
289
- "auto_mode": false, "quality_mode": "standard", "passed_gates": [],
290
- "context": { "issue_id": null, "scratch_dir": null, "plan_dir": null, "analysis_dir": null, "brainstorm_dir": null },
291
- "steps": [{ "index": 0, "type": "external|decision", "skill": "", "args": "", "barrier": false, "status": "pending", "wave_n": null }],
292
- "waves": [], "current_step": 0
293
- }
294
- ```
295
-
296
- ### Worker Contract
297
-
298
- ```
299
- Execute skill_call: {skill_call}. Topic: {topic}.
300
- Do not modify .workflow/.maestro/ status files.
301
- Return: { status, skill_call, summary, artifacts, error }
302
- ```
303
-
304
- ### Wave CSV Schema
305
-
306
- ```csv
307
- id,skill_call,topic
308
- "3","$maestro-verify 1","Ralph step 3/14: verify phase 1"
309
- ```
310
-
311
- Rules: decision nodes NEVER in CSV; barrier -> single-row; non-barrier -> multi-row.
312
-
313
- ### Error Codes
314
-
315
- | Condition | Recovery |
316
- |-----------|----------|
317
- | No intent and no running session | Prompt for intent |
318
- | Cannot infer lifecycle position | Show raw state, ask user |
319
- | Artifact dir not found for decision | Show glob results, ask user |
320
- | Delegate verdict parse failed | Fallback: treat as "fix" |
321
- | Wave timeout | Mark step failed, pause |
322
- | No session for execute/continue | Suggest $maestro-ralph "intent" |
323
-
324
- ### Success Criteria
325
-
326
- - [ ] Lifecycle position inferred from bootstrap + artifact chain + result files
327
- - [ ] Quality mode governs step generation
328
- - [ ] buildSkillCall() with arg enrichment + auto flag
329
- - [ ] Quality-gate decisions delegate-evaluated via maestro delegate --role analyze
330
- - [ ] Confidence-based verdict adjustment applied
331
- - [ ] -y: auto-follow verdict, no STOP (except post-debug-escalate)
332
- - [ ] passed_gates[] tracked, cleared on code changes
333
- - [ ] Fix-loop templates with gap_summary from delegate
334
- - [ ] retry_count per decision, max_retries enforced
335
- - [ ] ALL external steps via spawn_agents_on_csv
336
- - [ ] Barrier solo wave, non-barriers parallel
337
- - [ ] status.json persisted after every wave and decision
338
-
339
- </appendix>
1
+ ---
2
+ name: maestro-ralph
3
+ description: Use when the optimal command sequence is unclear and needs automated state-based determination
4
+ argument-hint: "\"intent\" [-y] | status | continue | execute"
5
+ allowed-tools: Read, Write, Edit, Bash, Glob, Grep, request_user_input
6
+ ---
7
+
8
+ <purpose>
9
+ Closed-loop decision engine for the maestro workflow lifecycle.
10
+ Coordinator infers position -> decomposes intent into a goal-tracked sub-goal checklist ->
11
+ builds chain -> **directly invokes each skill in-context** -> delegates evaluation at
12
+ decision nodes -> dynamically grows the step chain until all sub-goals converge.
13
+
14
+ Entry: `"intent"` (new session), `execute`/`continue` (resume), `status` (display).
15
+ Node types: **skill** (direct in-context invocation) and **decision** (delegate / structural evaluate).
16
+ Session at `.workflow/.maestro/ralph-{YYYYMMDD-HHmmss}/status.json`.
17
+
18
+ Codex specifics:
19
+ - **No agent spawning** — skills run directly in coordinator context, sequentially, one step at a time.
20
+ - **Goal created via built-in tool** — `create_goal` binds the decomposed sub-goal checklist as a
21
+ hard objective; `update_plan` mirrors steps; `update_goal` releases on convergence. Codex
22
+ registers the goal itself via `create_goal`.
23
+ </purpose>
24
+
25
+ <context>
26
+ $ARGUMENTS -- intent text, flags, or keywords.
27
+
28
+ **Parse**: `-y` -> auto_mode. `.md/.txt` path -> input_doc (supplementary, never substitutes lifecycle). Remaining -> intent.
29
+
30
+ **`-y` downstream propagation**:
31
+
32
+ | Skill | Flag | Effect |
33
+ |-------|------|--------|
34
+ | maestro-init | `-y` | skip interactive |
35
+ | maestro-analyze | `-y` | skip scoping |
36
+ | maestro-brainstorm | `-y` | skip questions |
37
+ | maestro-roadmap | `-y` | skip choices |
38
+ | maestro-plan | `-y` | skip confirmation |
39
+ | maestro-execute | `-y` | skip confirmation, auto-continue blocked |
40
+ | quality-auto-test | `-y` | skip plan confirmation |
41
+ | quality-test | `-y --auto-fix` | auto gap-fix loop |
42
+ | maestro-verify | `-y` | skip confirmation |
43
+ | quality-review | `-y` | skip confirmation |
44
+ | quality-debug | `-y` | skip confirmation |
45
+ | maestro-milestone-complete | `-y` | skip knowledge promotion |
46
+ | maestro-milestone-audit | `-y` | skip confirmation |
47
+
48
+ **State files**: `.workflow/state.json`, `.workflow/roadmap.md`, `.workflow/.maestro/ralph-*/status.json`
49
+ </context>
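
The **Parse** rule in this context block can be sketched as below. It is a minimal illustration, not the package's parser: `ParsedArgs` and `parseRalphArgs` are hypothetical names, whitespace tokenization is an assumption, and so is the choice to treat only the first `.md`/`.txt` token as `input_doc`.

```typescript
// Hypothetical result shape; the three fields mirror the Parse rule.
interface ParsedArgs {
  autoMode: boolean;       // set by the -y flag
  inputDoc: string | null; // first .md/.txt path, supplementary only
  intent: string;          // everything else, joined back together
}

function parseRalphArgs(raw: string): ParsedArgs {
  const result: ParsedArgs = { autoMode: false, inputDoc: null, intent: "" };
  const rest: string[] = [];
  // Assumption: arguments are split on whitespace.
  for (const token of raw.trim().split(/\s+/).filter((t) => t.length > 0)) {
    if (token === "-y") {
      result.autoMode = true;
    } else if (/\.(md|txt)$/i.test(token) && result.inputDoc === null) {
      result.inputDoc = token;
    } else {
      rest.push(token);
    }
  }
  // Strip one pair of wrapping double quotes, as in `"intent" -y`.
  result.intent = rest.join(" ").replace(/^"(.*)"$/, "$1");
  return result;
}
```

For example, `parseRalphArgs('"add login" -y docs/spec.md')` separates the flag and the document path from the intent text `add login`.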
50
+
51
+ <invariants>
52
+ 1. **Skills invoked DIRECTLY in-context** -- coordinator runs `$skill {resolved_args}` itself, sequentially. NO spawn_agents_on_csv, NO wave, NO worker CSV.
53
+ 2. **Coordinator owns the loop** -- infer -> decompose -> build -> for each step: resolve args -> invoke skill -> read result -> persist -> next.
54
+ 3. **Decision nodes evaluate, never execute** — quality-gate/goal-gate via `maestro delegate --role analyze`; structural decisions evaluated directly.
55
+ 4. **Goal is tool-created, not prompt-emitted** — `A_DECOMPOSE_TASKS` calls `create_goal` with the sub-goal checklist as success criteria. `update_goal` on full convergence; never released while sub-goals unmet.
56
+ 5. **task_decomposition drives DYNAMIC step growth** — sub-goals are the convergence spec; `steps[]` is a living array. `post-goal-audit` re-checks the checklist and **inserts scoped skill steps** for every unmet sub-goal (same insert+reindex+retry mechanism as fix-loops). Decomposition feeds adaptive branching, never freezes a plan.
57
+ 6. **Status JSON: schema-additive + step-dynamic** — decomposition fields (`boundary_contract`, `execution_criteria`, `task_decomposition`, `goal_checklist_path`) are OPTIONAL; absent = old behavior. `steps[]` grows at runtime via decisions; `goal_ref` traces dynamically-added steps. Never remove/rename existing fields.
58
+ 7. **Sequential execution** — one step at a time in index order; no parallelism (no spawning). Each step's result read before the next starts.
59
+ 8. **Quality mode governs steps** — full/standard/quick determines quality stages.
60
+ 9. **passed_gates skip** -- already-passed gates not re-run unless code changed.
61
+ </invariants>
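
Invariants 5 and 7 combine a growing `steps[]` array with strict index-order execution. A minimal sketch of that insert-and-reindex idea follows. The `Step` interface and both helpers are hypothetical, and the exact status values are an assumption; `type`, `goal_ref`, and the insert+reindex behavior are the only parts drawn from the text.

```typescript
// Hypothetical step shape; status values are assumed, goal_ref is optional.
interface Step {
  index: number;
  type: "skill" | "decision";
  skill: string;
  args: string;
  status: "pending" | "running" | "completed" | "failed";
  goal_ref?: string; // traces a dynamically added step back to its sub-goal
}

// Insert new steps right after `position`, then renumber every index.
function insertAndReindex(
  steps: Step[],
  position: number,
  added: Omit<Step, "index">[],
): Step[] {
  const merged = [
    ...steps.slice(0, position + 1),
    ...added.map((s) => ({ ...s, index: -1 })), // placeholder, fixed below
    ...steps.slice(position + 1),
  ];
  return merged.map((s, i) => ({ ...s, index: i }));
}

// Sequential execution: the lowest-index pending step always runs next.
function nextPending(steps: Step[]): Step | undefined {
  return steps.find((s) => s.status === "pending");
}
```

Because the array is renumbered after every insertion, a step added by a decision node is picked up by `nextPending` before any step that was originally queued behind that decision.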
62
+
63
+ <state_machine>
64
+
65
+ <states>
66
+ S_PARSE_ROUTE -- Parse arguments, route to entry point PERSIST: --
67
+ S_STATUS -- Show session progress, then end PERSIST: --
68
+ S_INFER -- Infer lifecycle position PERSIST: session.lifecycle_position
69
+ S_RESOLVE_PHASE -- Resolve target phase PERSIST: session.phase
70
+ S_QUALITY_MODE -- Determine quality mode (full/standard/quick) PERSIST: session.quality_mode
71
+ S_DECOMPOSE -- Clarify boundary, write execution criteria + sub-goals, create goal PERSIST: session.boundary_contract, .execution_criteria, .task_decomposition
72
+ S_BUILD_CHAIN -- Build step chain PERSIST: session.steps[]
73
+ S_CREATE_SESSION -- Write status.json PERSIST: session (full)
74
+ S_CONFIRM -- User confirmation (skipped in auto_mode) PERSIST: --
75
+ S_LOAD_NEXT -- Find next pending step PERSIST: --
76
+ S_STEP_EXEC -- Invoke the skill directly to execute the step PERSIST: session.steps[], context
77
+ S_DECISION_EVAL -- Evaluate quality gate / goal gate PERSIST: --
78
+ S_APPLY_VERDICT -- Apply verdict PERSIST: passed_gates[], retry_count
79
+ S_FIX_LOOP -- Insert fix steps, reindex PERSIST: session.steps[] (expanded)
80
+ S_COMPLETE -- Mark completed PERSIST: session.status = "completed"
81
+ S_PAUSED -- Pause, await manual intervention PERSIST: session.status = "paused"
82
+ S_FALLBACK -- Request user input PERSIST: session.status = "paused"
83
+ </states>
84
+
85
+ <transitions>
86
+
87
+ S_PARSE_ROUTE:
88
+ -> S_STATUS WHEN: intent == "status"
89
+ -> S_LOAD_NEXT WHEN: intent == "execute" | "continue"
90
+ -> S_DECISION_EVAL WHEN: running session with decision step in "running"
91
+ -> S_INFER WHEN: intent non-empty
92
+ -> S_FALLBACK WHEN: no intent AND no running session
93
+
94
+ S_STATUS -> END DO: A_SHOW_STATUS
95
+
96
+ S_INFER:
97
+ -> S_RESOLVE_PHASE WHEN: position resolved DO: A_INFER_POSITION
98
+ -> S_FALLBACK WHEN: cannot infer
99
+
100
+ S_RESOLVE_PHASE:
101
+ -> S_QUALITY_MODE DO: A_RESOLVE_PHASE
102
+
103
+ S_QUALITY_MODE:
104
+ -> S_DECOMPOSE DO: A_DETERMINE_QUALITY_MODE
105
+
106
+ S_DECOMPOSE:
107
+ -> S_BUILD_CHAIN DO: A_DECOMPOSE_TASKS
108
+ GUARD: broad intent (重构/全面/重写/迁移/overhaul/migrate/rewrite) -> MUST clarify boundary even if auto_mode
109
+ GUARD: narrow intent (single file/function/bug) -> auto-derive, skip questions
110
+ GUARD: position in {brainstorm, init} -> skip decomposition (no concrete target yet)
111
+
112
+ S_BUILD_CHAIN:
113
+ -> S_CREATE_SESSION DO: A_BUILD_STEPS
114
+
115
+ S_CREATE_SESSION:
116
+ -> S_CONFIRM WHEN: not auto_mode DO: A_CREATE_SESSION
117
+ -> S_LOAD_NEXT WHEN: auto_mode DO: A_CREATE_SESSION
118
+
119
+ S_CONFIRM:
120
+ -> S_LOAD_NEXT WHEN: "Proceed"
121
+ -> S_BUILD_CHAIN WHEN: "Edit"
122
+ -> S_QUALITY_MODE WHEN: "Change quality mode"
123
+ -> S_PAUSED WHEN: "Cancel"
124
+
125
+ S_LOAD_NEXT:
126
+ -> S_DECISION_EVAL WHEN: next_step.type == "decision"
127
+ -> S_STEP_EXEC WHEN: next_step.type == "skill"
128
+ -> S_COMPLETE WHEN: no pending steps
129
+
130
+ S_STEP_EXEC:
131
+ -> S_LOAD_NEXT WHEN: success DO: A_EXEC_STEP
132
+ -> S_PAUSED WHEN: failed GUARD: auto_mode retry once then pause
133
+
134
+ S_DECISION_EVAL:
135
+ -> S_APPLY_VERDICT WHEN: quality-gate DO: A_DELEGATE_EVALUATE
136
+ -> S_APPLY_VERDICT WHEN: goal-gate DO: A_GOAL_AUDIT_EVALUATE
137
+ -> S_APPLY_VERDICT WHEN: structural DO: A_STRUCTURAL_EVALUATE
138
+
139
+ S_APPLY_VERDICT:
140
+ -> S_LOAD_NEXT WHEN: proceed DO: add gate to passed_gates
141
+ -> S_LOAD_NEXT WHEN: post-goal-audit + all sub-goals met DO: A_APPLY_GOAL_DONE
142
+ -> S_LOAD_NEXT WHEN: post-goal-audit + unmet sub-goals DO: A_APPLY_GOAL_FIX
143
+ -> S_FIX_LOOP WHEN: fix DO: clear passed_gates, increment retry
144
+ -> S_PAUSED WHEN: escalate
145
+ -> S_LOAD_NEXT WHEN: post-milestone + next milestone DO: A_ADVANCE_MILESTONE
146
+ -> S_COMPLETE WHEN: post-milestone + no next
147
+ -> S_PAUSED WHEN: post-debug-escalate (always, even -y)
148
+ GUARD: retry >= max_retries -> force escalate
149
+ GUARD: confidence < 60 AND proceed -> override to fix
150
+ GUARD: confidence > 95 AND fix AND retry > 0 -> suggest proceed
151
+
152
+ S_FIX_LOOP:
153
+ -> S_LOAD_NEXT DO: A_INSERT_FIX_LOOP
154
+
155
+ S_COMPLETE -> END DO: A_FINALIZE
156
+ S_PAUSED -> END DO: A_PAUSE_SESSION
157
+ S_FALLBACK -> S_PARSE_ROUTE WHEN: user input | -> END WHEN: cancel
158
+
159
+ </transitions>
160
+
161
+ <actions>
162
+
163
+ ### A_INFER_POSITION
164
+
165
+ **Intent-based override**: brainstorm pattern -> position = brainstorm.
166
+
167
+ **Bootstrap detection**:
168
+
169
+ | Condition | Position |
170
+ |-----------|----------|
171
+ | No .workflow/ + no source | brainstorm |
172
+ | No .workflow/ + has source | init |
173
+ | Has .workflow/ but no state.json | init |
174
+ | Has state.json | artifact-based inference |
175
+
176
+ **Artifact-based inference:** Filter by current_milestone + target phase:
177
+
178
+ | Condition | Position |
179
+ |-----------|----------|
180
+ | no milestones defined or no roadmap.md | `roadmap` |
181
+ | no artifacts for target phase | `analyze` |
182
+ | latest artifact = analyze | `plan` |
183
+ | latest artifact = plan | `execute` |
184
+ | latest artifact = execute | `verify` |
185
+ | latest artifact = verify | → refine from result files |
186
+
187
+ **Refine from verify results:**
188
+
189
+ | Condition | Position |
190
+ |-----------|----------|
191
+ | verification.json: passed==false or gaps[] non-empty | `verify-failed` |
192
+ | passed==true, no review.json, has auto-test report | `review` |
193
+ | passed==true, no review.json, no auto-test report | `business-test` (full) / `review` (standard/quick) |
194
+ | review.json: verdict=="BLOCK" | `review-failed` |
195
+ | review.json: verdict!="BLOCK" | `test` |
196
+ | uat.md: all passed | `milestone-audit` |
197
+ | uat.md: has failures | `test-failed` |
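The bootstrap and artifact tables above can be read as one ordered lookup. The following is a minimal sketch, assuming a hypothetical `WorkspaceProbe` object that summarizes what was found on disk; the field names and the `refine-from-results` placeholder are illustrative and not part of the skill. The intent-based brainstorm override and the verify-result refinement are left out.

```typescript
type Artifact = "analyze" | "plan" | "execute" | "verify";

type Position =
  | "brainstorm" | "init" | "roadmap" | "analyze"
  | "plan" | "execute" | "verify" | "refine-from-results";

// Hypothetical snapshot of the workspace (names are assumptions).
interface WorkspaceProbe {
  hasWorkflowDir: boolean;        // .workflow/ exists
  hasSource: boolean;             // source files exist
  hasStateJson: boolean;          // state.json exists
  hasRoadmap: boolean;            // milestones defined and roadmap.md present
  latestArtifact: Artifact | null; // latest artifact for the target phase
}

// Each artifact maps to the position that follows it.
const NEXT_POSITION: Record<Artifact, Position> = {
  analyze: "plan",
  plan: "execute",
  execute: "verify",
  verify: "refine-from-results", // refined further from result files
};

function inferPosition(p: WorkspaceProbe): Position {
  // Bootstrap detection
  if (!p.hasWorkflowDir) return p.hasSource ? "init" : "brainstorm";
  if (!p.hasStateJson) return "init";
  // Artifact-based inference
  if (!p.hasRoadmap) return "roadmap";
  if (p.latestArtifact === null) return "analyze";
  return NEXT_POSITION[p.latestArtifact];
}
```

Keeping the artifact mapping in a table rather than a chain of conditionals mirrors the spec's own layout, so a new lifecycle stage would be a one-line change.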
198
+
199
+ ### A_RESOLVE_PHASE
200
+
201
+ Priority: regex from intent `phase\s*(\d+)` -> latest in-progress artifact's phase -> first incomplete phase -> null (brainstorm/init/roadmap) -> request_user_input if ambiguous.
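The priority chain above can be sketched as a short fallback expression. This is an illustration only: the function names are hypothetical, the case-insensitive flag is an assumption (the spec gives just `phase\s*(\d+)`), and the final "ask the user if ambiguous" step is not modeled.

```typescript
// Extract an explicit phase number from the intent text, if any.
function phaseFromIntent(intent: string): number | null {
  const match = /phase\s*(\d+)/i.exec(intent);
  return match ? Number(match[1]) : null;
}

// Fall through the priority chain; null means no phase applies
// (brainstorm / init / roadmap positions).
function resolvePhase(
  intent: string,
  latestInProgress: number | null,
  firstIncomplete: number | null
): number | null {
  return phaseFromIntent(intent) ?? latestInProgress ?? firstIncomplete ?? null;
}
```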
202
+
203
+ ### A_DETERMINE_QUALITY_MODE
204
+
205
+ | Condition | Mode | Pipeline |
206
+ |-----------|------|----------|
207
+ | Has REQ-*.md + phase scope | full | verify -> business-test -> review -> test-gen -> test |
208
+ | Default | standard | verify -> review -> test (test-gen if coverage < 80%) |
209
+ | User --quality quick | quick | verify -> review --tier quick |
210
+
211
+ ### A_DECOMPOSE_TASKS
212
+
213
+ Build the boundary contract + outcome sub-goal checklist, then **register it as a Codex goal via the built-in tool**. Runs once at session creation, before chain build. Skipped when position in {brainstorm, init}.
214
+
215
+ **1. Classify intent breadth:**
216
+
217
+ | Pattern | Breadth | Clarify? |
218
+ |---------|---------|----------|
219
+ | 重构/全面/重写/重做/整体/迁移 · overhaul/migrate/rewrite/revamp | broad | MUST (ignores auto_mode) |
220
+ | named single file/function/bug, "fix X", "add Y to Z" | narrow | skip — auto-derive |
221
+ | otherwise | medium | clarify unless auto_mode |
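A rough sketch of the breadth classification follows. The broad keywords come from the table; `NARROW_PATTERN` is a hypothetical stand-in for "named single file/function/bug", and checking broad before narrow is an assumption about precedence that the spec does not state.

```typescript
type Breadth = "broad" | "narrow" | "medium";

// Keywords taken from the table above (Chinese and English variants).
const BROAD_KEYWORDS: string[] = [
  "重构", "全面", "重写", "重做", "整体", "迁移",
  "overhaul", "migrate", "rewrite", "revamp",
];

// Hypothetical heuristic: "fix X" / "add Y", or a mention of a file name.
const NARROW_PATTERN = /^(fix|add)\s+\S+|[\w./-]+\.(ts|js|py|md)\b/i;

function classifyBreadth(intent: string): Breadth {
  const lower = intent.toLowerCase();
  for (const keyword of BROAD_KEYWORDS) {
    if (lower.indexOf(keyword) >= 0) return "broad";
  }
  if (NARROW_PATTERN.test(intent)) return "narrow";
  return "medium";
}

// Broad always clarifies (ignores auto_mode); narrow never does;
// medium clarifies unless auto_mode is set.
function mustClarify(breadth: Breadth, autoMode: boolean): boolean {
  if (breadth === "broad") return true;
  if (breadth === "narrow") return false;
  return !autoMode;
}
```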
222
+
223
+ **2. Clarify boundary** (broad/medium) — `request_user_input`, ≤3 rounds, options pre-filled from intent + a quick Glob/Grep scan of the target module:
224
+
225
+ | Round | Question | Drives |
+ |-------|----------|--------|
+ | Scope | Which directories/files/layers are in scope? What is explicitly excluded? | boundary_contract.in_scope / out_of_scope |
+ | Constraints | Must it stay backward compatible? Is the public API frozen? Behavior/performance budget? Test threshold? | boundary_contract.constraints + execution_criteria |
+ | Done | What observable result counts as "done"? (e.g., all tests green + zero behavior change + metric X) | boundary_contract.definition_of_done |
230
+
231
+ narrow → derive defaults from intent + codebase, skip questions.
232
+
233
+ **3. Derive `execution_criteria`** (execution criteria: 3-6 short imperative rules every step obeys): backward-compat stance, scope-freeze ("change only the requested scope"), test/coverage bar, fix-don't-hide, incremental commit.
234
+
235
+ **4. Derive `task_decomposition`** (sub-goal checklist: outcome-oriented, NOT lifecycle stages). Each entry:
236
+ ```json
237
+ { "id": "G1", "goal": "<deliverable>", "boundary": "<in/out note>",
238
+ "done_when": "<objectively checkable condition>",
239
+ "evidence": "verification.json|review.json|uat.md|<test path>",
240
+ "lifecycle": ["analyze","execute","verify"], "status": "pending" }
241
+ ```
242
+ **Cleverness rule**: `done_when` MUST be objectively verifiable and SHOULD reference an artifact ralph already produces, so `post-goal-audit` can re-verify after context loss. Map each sub-goal to the lifecycle phase(s) producing its evidence — the existing pipeline becomes the machinery that satisfies the goals.
243
+
244
+ **5. Persist** (additive) into session: `boundary_contract`, `execution_criteria`, `task_decomposition`, `goal_checklist_path` = `{session_dir}/goal-checklist.md`. Write the checklist file (see Appendix: Goal Checklist Template).
245
+
246
+ **6. Register goal via `create_goal`:**
247
+ ```
248
+ create_goal({
249
+ objective: "Ralph: {intent} converge all {N} sub-goals within boundary",
250
+ success_criteria: task_decomposition.map(g => `${g.id}: ${g.done_when}`),
251
+ constraints: [...execution_criteria, "stay within boundary_contract"]
252
+ })
253
+ ```
254
+ Goal stays bound until A_APPLY_GOAL_DONE / A_FINALIZE calls `update_goal`. If decomposition is skipped, no `create_goal` is issued here; only the default lifecycle goal in A_CREATE_SESSION is created.
255
+
256
+ ### A_BUILD_STEPS
257
+
258
+ **Lifecycle stages:**
259
+
260
+ | Stage | Skill | Barrier | Quality Mode | Decision after |
+ |-------|-------|---------|-------------|----------------|
+ | brainstorm | maestro-brainstorm "{intent}" | yes | all | — |
+ | init | maestro-init | no | all | — |
+ | roadmap | maestro-roadmap "{intent}" | yes | all | — |
+ | analyze | maestro-analyze {phase} | yes | all | — |
+ | plan | maestro-plan {phase} | yes | all | — |
+ | execute | maestro-execute {phase} | yes | all | — |
+ | verify | maestro-verify {phase} | no | all | post-verify |
+ | business-test | quality-auto-test {phase} | no | full only | post-business-test |
+ | review | quality-review {phase} | no | all (quick: --tier quick) | post-review |
+ | test-gen | quality-auto-test {phase} | no | full; standard if coverage<80% | — |
+ | test | quality-test {phase} | no | full, standard | post-test |
+ | milestone-audit | maestro-milestone-audit | no | all | — |
+ | goal-audit | *(decision-only, no skill)* | — | all (only if decomposed) | post-goal-audit |
+ | milestone-complete | maestro-milestone-complete | no | all | post-milestone |
276
+
277
+ **Build rules:**
278
+ 1. Start from `lifecycle_position`, end at `milestone-complete`
279
+ 2. Skip stages with existing completed artifacts (check state.json)
280
+ 3. Filter stages by `quality_mode` — skip non-applicable stages (see Quality Mode column)
281
+ 4. Quick mode: `review` appends `--tier quick`; skips `business-test`, `test-gen`, `test`
282
+ 5. Insert decision node after each stage with non-empty Decision column: `{ type: "decision", decision: "<gate>", retry_count: 0, max_retries: 2 }`
283
+ 6. **If `task_decomposition` present**: insert a `goal-audit` pure-decision node (no skill) immediately before `milestone-complete` → `decision:post-goal-audit`
284
+ 7. Step type for runnable stages = `"skill"` (executed directly in-context, no spawn). `barrier` field retained as optional metadata only — execution is sequential regardless
285
+ 8. Args use placeholders `{phase}`, `{intent}`, `{dirs}` — resolved at step execution time
286
+ 9. Append `-y` to all skill args when `auto_mode` is true (see -y propagation table in context)
287
+ 10. Dynamically-inserted steps (by `post-goal-audit` / fix-loops) carry optional `goal_ref: "G{n}"`
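The build rules can be illustrated with a trimmed sketch. It assumes a reduced stage table (brainstorm, init, roadmap, test-gen and milestone-audit are omitted for brevity), ignores rule 2 (skipping completed artifacts), and uses hypothetical type and function names; only the stage names, skills, gates, and `max_retries: 2` come from the text above.

```typescript
type QualityMode = "full" | "standard" | "quick";

interface Stage {
  name: string;
  skill: string;
  args: string;
  modes: QualityMode[];
  decision: string | null; // gate inserted after the stage, if any
}

interface Step {
  index: number;
  type: "skill" | "decision";
  skill?: string;
  args?: string;
  decision?: string;
  retry_count?: number;
  max_retries?: number;
  status: "pending";
}

const ALL: QualityMode[] = ["full", "standard", "quick"];

// Reduced copy of the lifecycle table (an assumption for brevity).
const STAGES: Stage[] = [
  { name: "analyze", skill: "maestro-analyze", args: "{phase}", modes: ALL, decision: null },
  { name: "plan", skill: "maestro-plan", args: "{phase}", modes: ALL, decision: null },
  { name: "execute", skill: "maestro-execute", args: "{phase}", modes: ALL, decision: null },
  { name: "verify", skill: "maestro-verify", args: "{phase}", modes: ALL, decision: "post-verify" },
  { name: "business-test", skill: "quality-auto-test", args: "{phase}", modes: ["full"], decision: "post-business-test" },
  { name: "review", skill: "quality-review", args: "{phase}", modes: ALL, decision: "post-review" },
  { name: "test", skill: "quality-test", args: "{phase}", modes: ["full", "standard"], decision: "post-test" },
  { name: "milestone-complete", skill: "maestro-milestone-complete", args: "", modes: ALL, decision: "post-milestone" },
];

function buildSteps(start: string, mode: QualityMode, autoMode: boolean, decomposed: boolean): Step[] {
  const steps: Step[] = [];
  const addDecision = (decision: string): void => {
    steps.push({ index: steps.length, type: "decision", decision, retry_count: 0, max_retries: 2, status: "pending" });
  };
  let started = false;
  for (const stage of STAGES) {
    if (stage.name === start) started = true;                        // rule 1
    if (!started || stage.modes.indexOf(mode) < 0) continue;         // rules 3-4
    if (stage.name === "milestone-complete" && decomposed) addDecision("post-goal-audit"); // rule 6
    let args = stage.args;
    if (stage.name === "review" && mode === "quick") args += " --tier quick"; // rule 4
    if (autoMode) args += " -y";                                     // rule 9
    steps.push({ index: steps.length, type: "skill", skill: stage.skill, args: args.trim(), status: "pending" });
    if (stage.decision !== null) addDecision(stage.decision);        // rule 5
  }
  if (!started) throw new Error(`unknown lifecycle position: ${start}`);
  return steps;
}
```

For example, `buildSteps("verify", "quick", true, true)` yields verify, its gate, review with `--tier quick -y`, its gate, the goal-audit decision, milestone-complete, and the post-milestone gate.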
288
+
289
+ ### A_CREATE_SESSION
290
+
291
+ 1. Write `.workflow/.maestro/ralph-{YYYYMMDD-HHmmss}/status.json` (see Session JSON Schema); include decomposition fields only if produced (additive)
292
+ 2. Initialize tracking:
293
+ - If decomposed: goal already registered by A_DECOMPOSE_TASKS. Else: `create_goal({ objective: "Ralph lifecycle: {quality_mode} mode, {N} steps from {lifecycle_position}" })`
294
+ - `update_plan({ plan: steps.map(step => ({ step, status: "pending" })) })`
295
+ 3. If decomposed: display sub-goal checklist summary (path + G-ids + done_when). Display chain overview.
296
+
297
+ ### A_EXEC_STEP
298
+
299
+ Direct in-context skill invocation — **replaces the old spawn/wave/CSV mechanism**.
300
+
301
+ 1. Conditional step eval: `check_coverage` reads validation.json and skips the step if coverage ≥ threshold
302
+ 2. Resolve `{phase}` / `{intent}` / `{dirs}` placeholders; arg enrichment (plan → `--dir {analysis_dir}`, execute → `--dir {plan_dir}`); append `-y` if auto_mode
303
+ 3. Mark step `status="running"`, persist status.json + `update_plan` (this step → in_progress)
304
+ 4. **Invoke the skill directly**: execute `$skill {resolved_args}` in the coordinator's own context (NO spawn_agents_on_csv, NO worker). Read its produced artifacts directly.
305
+ 5. On success: capture summary + artifacts; mark step `status="done"`; update context (analyze→analysis_dir, plan→plan_dir, execute→exec_status, brainstorm→brainstorm_dir, roadmap→spec_session_id)
306
+ 6. On failure: mark `status="failed"`; auto_mode → retry once → still failed → S_PAUSED
307
+ 7. Persist status.json + `update_plan` after every step
308
+
309
+ ### A_DELEGATE_EVALUATE
310
+
311
+ 1. Resolve result files per decision type (post-verify: verification.json, post-business-test: report.json, post-review: review.json, post-test: uat.md + test-results.json)
312
+ 2. Execute `maestro delegate` with analysis prompt → parse verdict: STATUS (proceed/fix/escalate), REASON, GAP_SUMMARY, CONFIDENCE_SCORE, WEAKEST_DIMENSION
313
+ 3. Confidence adjustment: score < 60 + proceed → fix; score > 95 + fix + retry > 0 → suggest proceed
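The confidence adjustment in step 3, combined with the `max_retries` guard from S_APPLY_VERDICT, might look like the sketch below. The type names are hypothetical, and two points are assumptions: the guards are applied in the order shown, and the retry limit is enforced only when the verdict is (or has been overridden to) `fix`.

```typescript
type Status = "proceed" | "fix" | "escalate";

interface Verdict {
  status: Status;
  confidence: number; // CONFIDENCE_SCORE, 0-100
}

interface Adjusted {
  status: Status;
  suggestion?: "proceed"; // advisory only; the status itself is unchanged
}

function adjustVerdict(v: Verdict, retry: number, maxRetries: number): Adjusted {
  let status: Status = v.status;
  // Low-confidence "proceed" is overridden to "fix".
  if (status === "proceed" && v.confidence < 60) status = "fix";
  // Retries exhausted: force escalation instead of another fix loop.
  if (status === "fix" && retry >= maxRetries) return { status: "escalate" };
  // Very confident "fix" after at least one retry: suggest proceeding.
  if (status === "fix" && v.confidence > 95 && retry > 0) {
    return { status, suggestion: "proceed" };
  }
  return { status };
}
```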
314
+
315
+ ### A_GOAL_AUDIT_EVALUATE
316
+
317
+ Re-checks the goal-checklist and decides whether `steps[]` must dynamically grow. Only runs when `task_decomposition` present.
318
+
319
+ 1. Read `session.task_decomposition` + `goal_checklist_path`
320
+ 2. For each sub-goal `status != "done"`: resolve its `evidence` artifact under the current phase scratch dir
321
+ 3. Delegate read-only audit:
322
+ ```
323
+ maestro delegate "PURPOSE: Audit sub-goal achievement and decide whether additional execution steps are needed
+ TASK: Read the evidence artifact of each incomplete sub-goal | Judge met/unmet against done_when | For each unmet sub-goal, report the gap and the target phase
+ CONTEXT: @{goal_checklist_path} @{evidence artifacts} | Execution criteria: {execution_criteria} | Boundary: {boundary_contract}
+ EXPECTED: ---VERDICT--- STATUS(all_met|has_unmet) / UNMET=[{id:G2,gap:'...',target_phase:execute}] / CONFIDENCE_SCORE(0-100) ---END---
+ CONSTRAINTS: Evaluate only, do not modify | Judge strictly by done_when | Do not exceed boundary_contract"
328
+ --role analyze --mode analysis
329
+ ```
330
+ 4. On result: parse UNMET. For each met sub-goal → set `task_decomposition[i].status="done"` + flip `[ ]→[x]` in goal-checklist.md
331
+ 5. Verdict: `all_met` → A_APPLY_GOAL_DONE; `has_unmet` → A_APPLY_GOAL_FIX
332
+ GUARD: retry_count >= max_retries AND still unmet → escalate (insert quality-debug, S_PAUSED for human)
333
+
334
+ ### A_STRUCTURAL_EVALUATE
335
+
336
+ **post-milestone**: Read state.json next milestone → update session (milestone, phase, reset gates), re-infer quality_mode, insert lifecycle steps. No next → complete.
337
+ **post-debug-escalate**: Pause (always, even -y). Display: max retries reached, manual intervention needed.
338
+
339
+ ### A_INSERT_FIX_LOOP
340
+
341
+ Insert fix template by decision type after current position, reindex:
342
+ - **post-verify**: debug → plan --gaps → execute → verify → decision:post-verify
343
+ - **post-business-test**: debug --from-business-test → plan --gaps → execute → verify → decision:post-verify → auto-test → decision:post-business-test
344
+ - **post-review**: debug → plan --gaps → execute → verify → decision:post-verify → review → decision:post-review
345
+ - **post-test**: debug --from-uat → plan --gaps → execute → verify → decision:post-verify → [auto-test + decision:post-business-test (full)] → review → decision:post-review → [auto-test (full; standard if <80%)] → test → decision:post-test
346
+
347
+ ### A_APPLY_GOAL_FIX
348
+
349
+ **Dynamic step-growth core.** For every unmet sub-goal, inject scoped skill steps so `steps[]` grows toward convergence:
350
+
351
+ 1. For each `unmet` sub-goal `G{n}` (grouped by `target_phase` to avoid duplicate runs), insert before the `goal-audit` node a scoped mini-loop (see Appendix: Fix-Loop Templates → post-goal-audit), each inserted step tagged `goal_ref: "G{n}"`, type `"skill"`
352
+ 2. Re-append a fresh `decision:post-goal-audit {retry+1}` after inserted steps (re-loops until all met or max retries)
353
+ 3. Reindex steps, increment retry_count, persist status.json + `update_plan` (steps[] grew)
354
+ 4. Display: ◆ Goal audit: {k} sub-goals unmet → +{N} steps inserted (G{ids}), retry {r}/{max}
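One way to picture the step growth is the sketch below. It reads steps 1 and 2 as "the evaluated audit node is replaced by the mini-loops followed by a fresh audit node"; that reading is an interpretation, not something the spec spells out. Grouping by `target_phase` is omitted, and the helper and type names are hypothetical.

```typescript
interface Step {
  index: number;
  type: "skill" | "decision";
  skill?: string;
  args?: string;
  decision?: string;
  retry_count?: number;
  goal_ref?: string;
  status: string;
}

interface Unmet {
  id: string;           // e.g. "G2"
  gap: string;
  target_phase: string;
}

function skillStep(skill: string, args: string, goalRef: string): Step {
  return { index: 0, type: "skill", skill, args, goal_ref: goalRef, status: "pending" };
}

function applyGoalFix(steps: Step[], auditIndex: number, unmet: Unmet[]): Step[] {
  const audit = steps[auditIndex];
  if (!audit || audit.decision !== "post-goal-audit") {
    throw new Error("auditIndex must point at a post-goal-audit decision");
  }
  // One scoped plan -> execute -> verify mini-loop per unmet sub-goal.
  const inserted: Step[] = [];
  for (const u of unmet) {
    inserted.push(skillStep("maestro-plan", `--gaps ${u.target_phase} "${u.id}: ${u.gap}"`, u.id));
    inserted.push(skillStep("maestro-execute", u.target_phase, u.id));
    inserted.push(skillStep("maestro-verify", u.target_phase, u.id));
  }
  // Fresh audit node with an incremented retry counter.
  const fresh: Step = {
    index: 0,
    type: "decision",
    decision: "post-goal-audit",
    retry_count: (audit.retry_count ?? 0) + 1,
    status: "pending",
  };
  const grown = steps.slice(0, auditIndex).concat(inserted, [fresh], steps.slice(auditIndex + 1));
  // Reindex so that index always matches array position.
  return grown.map((s, i) => ({ ...s, index: i }));
}
```

The function returns a new array rather than mutating `steps`, which keeps the persisted status.json write a single, explicit step for the caller.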
355
+
356
+ ### A_APPLY_GOAL_DONE
357
+
358
+ 1. Set all `task_decomposition[*].status="done"`, persist status.json
359
+ 2. Append `ALL_GOALS_DONE` sentinel to goal-checklist.md
360
+ 3. `update_goal({ status: "complete" })` — releases the decomposition goal constraint
361
+ 4. Mark goal-audit decision completed; proceed to `milestone-complete`
362
+ 5. Display: ◆ Goal audit: all sub-goals met ✓ (goal released)
363
+
364
+ ### A_ADVANCE_MILESTONE
365
+
366
+ Update session: milestone, phase, reset passed_gates. Re-infer quality_mode. Build + insert new lifecycle steps for next milestone (re-append goal-audit before milestone-complete if decomposed).
367
+
368
+ ### A_FINALIZE
369
+
370
+ 1. Set `session.status = "completed"`, write status.json
371
+ 2. Sync `update_plan`: all steps → "completed"
372
+ 3. `update_goal({ status: "complete" })` — release goal constraint (idempotent if already released by A_APPLY_GOAL_DONE)
373
+ 4. Display completion report
374
+
375
+ ### A_PAUSE_SESSION
376
+
377
+ 1. Set `session.status = "paused"`, write status.json
378
+ 2. Do NOT call `update_goal` — goal stays bound for `execute`/`continue` resume
379
+ 3. Display: use `$maestro-ralph execute` to continue
380
+
381
+ ### A_SHOW_STATUS
382
+
383
+ 1. Find latest ralph session
384
+ 2. Display: Session, Status, Position, Quality mode, Progress, Current step
385
+ 3. List steps: [✓] done, [▸] running, [ ] pending, [◆] decision (with goal_ref if set)
386
+ 4. If `task_decomposition` present: show `Sub-goals: {done}/{total}` + unmet G-ids (graceful skip if absent — backward compat)
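The graceful skip in step 4 could be sketched as follows, treating `task_decomposition` as an optional key. The `Sub-goals: {done}/{total}` prefix comes from the text; the `(unmet: ...)` suffix format and the function name are assumptions.

```typescript
interface SubGoal {
  id: string;
  status: string; // "pending" | "done"
}

interface Session {
  task_decomposition?: SubGoal[]; // optional: older sessions lack this key
}

// Returns the status line, or null when the session was never decomposed.
function subGoalLine(session: Session): string | null {
  const goals = session.task_decomposition;
  if (!goals || goals.length === 0) return null;
  const unmet = goals.filter(g => g.status !== "done").map(g => g.id);
  const done = goals.length - unmet.length;
  const tail = unmet.length > 0 ? ` (unmet: ${unmet.join(", ")})` : "";
  return `Sub-goals: ${done}/${goals.length}${tail}`;
}
```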
387
+
388
+ </actions>
389
+
390
+ </state_machine>
391
+
392
+ <appendix>
393
+
394
+ ### Session JSON Schema
395
+
396
+ ```json
397
+ {
398
+ "session_id": "ralph-{YYYYMMDD-HHmmss}",
399
+ "source": "ralph", "intent": "", "status": "running|paused|completed",
400
+ "lifecycle_position": "", "phase": null, "milestone": null,
401
+ "auto_mode": false, "quality_mode": "standard", "passed_gates": [],
402
+ "context": { "issue_id": null, "scratch_dir": null, "plan_dir": null, "analysis_dir": null, "brainstorm_dir": null },
403
+ "steps": [{ "index": 0, "type": "skill|decision", "skill": "", "args": "", "barrier": false, "status": "pending", "goal_ref": null }],
404
+ "waves": [], "current_step": 0,
405
+
406
+ "_comment": "↓ OPTIONAL additive decomposition block. Absent → no decomposition; readers MUST tolerate missing keys. Never remove/rename above fields.",
407
+ "boundary_contract": { "in_scope": [], "out_of_scope": [], "constraints": [], "definition_of_done": "" },
408
+ "execution_criteria": [],
409
+ "task_decomposition": [
410
+ { "id": "G1", "goal": "", "boundary": "", "done_when": "", "evidence": "", "lifecycle": [], "status": "pending|done" }
411
+ ],
412
+ "goal_checklist_path": ""
413
+ }
414
+ ```
415
+
416
+ > **Extensibility contract (two dimensions)**:
417
+ > 1. **Schema-additive** — decomposition block fields optional; absence = old behavior.
418
+ > 2. **Step-dynamic** — `steps[]` is a living array: `post-goal-audit` (and fix/escalate/milestone decisions) **append/reindex steps at runtime** until sub-goals converge. The JSON "extends" primarily by growing `steps[]`, not by freezing a plan. `goal_ref` (optional, default null) traces dynamically-added steps to the spawning sub-goal. `waves` retained as empty array for backward-compat (no longer populated — spawning removed).
419
+
420
+ ### Goal Checklist Template
421
+
422
+ Written to `{session_dir}/goal-checklist.md`. Stable within the session; never renamed (so the registered `create_goal` success criteria stay traceable).
423
+
424
+ ```markdown
425
+ # Ralph Goal Checklist — {session_id}
426
+ > Intent: {intent}
427
+
428
+ ## Execution Criteria
429
+ - {criterion 1}
430
+ - {criterion 2}
431
+
432
+ ## Boundary Contract
433
+ - In scope: {in_scope}
434
+ - Out of scope: {out_of_scope}
435
+ - Constraints: {constraints}
436
+ - Definition of Done: {definition_of_done}
437
+
438
+ ## Sub-goals
439
+ - [ ] G1: {goal} — done when: {done_when} (evidence: {evidence})
440
+ - [ ] G2: {goal} — done when: {done_when} (evidence: {evidence})
441
+
442
+ <!-- A_GOAL_AUDIT_EVALUATE flips [ ]→[x] when evidence confirms;
443
+ A_APPLY_GOAL_DONE appends `ALL_GOALS_DONE` once all [x]. -->
444
+ ```
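The two checklist mutations described in the template comment (flipping a box and appending the sentinel) can be sketched as plain string operations. The function names are hypothetical, and the sentinel is assumed to sit on a line of its own so that the mention inside the HTML comment is not mistaken for it.

```typescript
// Flip "- [ ] G1: ..." to "- [x] G1: ..." for one sub-goal id.
function markGoalDone(checklist: string, id: string): string {
  const prefix = `- [ ] ${id}:`;
  return checklist
    .split("\n")
    .map(line => (line.indexOf(prefix) === 0 ? "- [x]" + line.slice(5) : line))
    .join("\n");
}

// Append the ALL_GOALS_DONE sentinel once every sub-goal box is checked.
// Idempotent: a second call leaves the text unchanged.
function appendSentinelIfDone(checklist: string): string {
  const lines = checklist.split("\n");
  const hasOpen = lines.some(l => l.indexOf("- [ ] G") === 0);
  const hasSentinel = lines.some(l => l.trim() === "ALL_GOALS_DONE");
  if (hasOpen || hasSentinel) return checklist;
  return checklist.replace(/\s+$/, "") + "\nALL_GOALS_DONE\n";
}
```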
445
+
446
+ ### Fix-Loop Templates
447
+
448
+ - **post-verify**: `$quality-debug "{gap}"` → `$maestro-plan --gaps {phase}` → `$maestro-execute {phase}` → `$maestro-verify {phase}` → decision:post-verify {retry+1}
449
+ - **post-business-test**: debug --from-business-test → plan --gaps → execute → verify → decision:post-verify {0} → auto-test → decision:post-business-test {retry+1}
450
+ - **post-review**: debug → plan --gaps → execute → review → decision:post-review {retry+1}
451
+ - **post-test**: debug --from-uat → plan --gaps → execute → verify → decision:post-verify {0} → auto-test → decision:post-business-test {0} → review → decision:post-review {0} → auto-test → test → decision:post-test {retry+1}
452
+ - **post-goal-audit** (per unmet sub-goal group — dynamically grows steps[]):
453
+ ```
454
+ # for each unmet G{n}, scoped to its target_phase:
455
+ $maestro-plan --gaps {target_phase} "G{n}: {gap}" [goal_ref: G{n}]
456
+ $maestro-execute {target_phase} [goal_ref: G{n}]
457
+ $maestro-verify {target_phase} [goal_ref: G{n}]
458
+ # after all unmet groups inserted:
459
+ decision:post-goal-audit {retry+1}
460
+ ```
461
+ Only unmet sub-goals' phases re-run (no full-pipeline replay); loop exits on `all_met` (→ A_APPLY_GOAL_DONE) or retry max (→ escalate). Growth bounded.
462
+
463
+ ### Error Codes
464
+
465
+ | Condition | Recovery |
466
+ |-----------|----------|
467
+ | No intent and no running session | Prompt for intent |
468
+ | Cannot infer lifecycle position | Show raw state, ask user |
469
+ | Artifact dir not found for decision | Show glob results, ask user |
470
+ | Delegate verdict parse failed | Fallback: treat as "fix" |
471
+ | Step skill invocation failed | Mark step failed; auto_mode retry once then pause |
472
+ | No session for execute/continue | Suggest $maestro-ralph "intent" |
473
+
474
+ ### Success Criteria
475
+
476
+ - [ ] Lifecycle position inferred from bootstrap + artifact chain + result files
477
+ - [ ] Quality mode governs step generation
478
+ - [ ] Decomposition runs as initial step; broad intent boundary-clarified via ≤3 questions (ignores auto_mode); narrow auto-derives
479
+ - [ ] Goal registered via built-in `create_goal` with sub-goal success criteria (NOT a user-copied prompt)
480
+ - [ ] status.json enriched additively with boundary_contract + execution_criteria + task_decomposition; absent = old behavior preserved
481
+ - [ ] goal-checklist.md generated with verifiable done_when + ALL_GOALS_DONE sentinel
482
+ - [ ] post-goal-audit decision inserted before milestone-complete (only when decomposed)
483
+ - [ ] Unmet sub-goals DYNAMICALLY grow steps[] via scoped per-goal mini-loops (goal_ref tagged), loop until all_met or max retries → escalate
484
+ - [ ] Skills invoked DIRECTLY in-context — NO spawn_agents_on_csv, NO wave/CSV/worker
485
+ - [ ] Sequential execution; status.json + update_plan persisted after every step and decision
486
+ - [ ] Quality-gate / goal-gate decisions delegate-evaluated via maestro delegate --role analyze
487
+ - [ ] Confidence-based verdict adjustment applied
488
+ - [ ] -y: auto-follow verdict, no STOP (except post-debug-escalate)
489
+ - [ ] update_goal released on convergence (A_APPLY_GOAL_DONE / A_FINALIZE); held while paused
490
+
491
+ </appendix>