maestro-flow 0.4.5 → 0.4.7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (52)
  1. package/.codex/skills/maestro-collab/SKILL.md +218 -117
  2. package/.codex/skills/maestro-execute/SKILL.md +13 -11
  3. package/.codex/skills/maestro-milestone-audit/SKILL.md +12 -10
  4. package/.codex/skills/maestro-ralph/SKILL.md +16 -4
  5. package/.codex/skills/maestro-ui-codify/SKILL.md +18 -16
  6. package/.codex/skills/manage-codebase-rebuild/SKILL.md +20 -13
  7. package/.codex/skills/manage-issue-discover/SKILL.md +19 -17
  8. package/.codex/skills/quality-debug/SKILL.md +35 -31
  9. package/.codex/skills/quality-refactor/SKILL.md +20 -12
  10. package/.codex/skills/quality-review/SKILL.md +21 -17
  11. package/.codex/skills/team-coordinate/SKILL.md +462 -235
  12. package/.codex/skills/team-coordinate/specs/role-catalog.md +132 -0
  13. package/.codex/skills/team-lifecycle-v4/SKILL.md +445 -191
  14. package/.codex/skills/team-quality-assurance/SKILL.md +205 -161
  15. package/.codex/skills/team-review/SKILL.md +198 -159
  16. package/.codex/skills/team-tech-debt/SKILL.md +214 -144
  17. package/.codex/skills/team-testing/SKILL.md +210 -158
  18. package/package.json +1 -1
  19. package/.codex/skills/team-coordinate/roles/coordinator/commands/analyze-task.md +0 -247
  20. package/.codex/skills/team-coordinate/roles/coordinator/commands/dispatch.md +0 -126
  21. package/.codex/skills/team-coordinate/roles/coordinator/commands/monitor.md +0 -265
  22. package/.codex/skills/team-coordinate/roles/coordinator/role.md +0 -403
  23. package/.codex/skills/team-coordinate/specs/knowledge-transfer.md +0 -113
  24. package/.codex/skills/team-coordinate/specs/pipelines.md +0 -97
  25. package/.codex/skills/team-coordinate/specs/quality-gates.md +0 -112
  26. package/.codex/skills/team-coordinate/specs/role-spec-template.md +0 -192
  27. package/.codex/skills/team-executor/SKILL.md +0 -116
  28. package/.codex/skills/team-executor/roles/executor/commands/monitor.md +0 -213
  29. package/.codex/skills/team-executor/roles/executor/role.md +0 -173
  30. package/.codex/skills/team-executor/specs/session-schema.md +0 -230
  31. package/.codex/skills/team-lifecycle-v4/roles/coordinator/commands/analyze.md +0 -56
  32. package/.codex/skills/team-lifecycle-v4/roles/coordinator/commands/dispatch.md +0 -61
  33. package/.codex/skills/team-lifecycle-v4/roles/coordinator/commands/monitor.md +0 -113
  34. package/.codex/skills/team-lifecycle-v4/roles/coordinator/role.md +0 -189
  35. package/.codex/skills/team-lifecycle-v4/schemas/tasks-schema.md +0 -100
  36. package/.codex/skills/team-lifecycle-v4/specs/knowledge-transfer.md +0 -204
  37. package/.codex/skills/team-quality-assurance/roles/coordinator/commands/analyze.md +0 -72
  38. package/.codex/skills/team-quality-assurance/roles/coordinator/commands/dispatch.md +0 -108
  39. package/.codex/skills/team-quality-assurance/roles/coordinator/commands/monitor.md +0 -163
  40. package/.codex/skills/team-quality-assurance/roles/coordinator/role.md +0 -177
  41. package/.codex/skills/team-review/roles/coordinator/commands/analyze.md +0 -71
  42. package/.codex/skills/team-review/roles/coordinator/commands/dispatch.md +0 -90
  43. package/.codex/skills/team-review/roles/coordinator/commands/monitor.md +0 -135
  44. package/.codex/skills/team-review/roles/coordinator/role.md +0 -176
  45. package/.codex/skills/team-tech-debt/roles/coordinator/commands/analyze.md +0 -47
  46. package/.codex/skills/team-tech-debt/roles/coordinator/commands/dispatch.md +0 -163
  47. package/.codex/skills/team-tech-debt/roles/coordinator/commands/monitor.md +0 -133
  48. package/.codex/skills/team-tech-debt/roles/coordinator/role.md +0 -173
  49. package/.codex/skills/team-testing/roles/coordinator/commands/analyze.md +0 -70
  50. package/.codex/skills/team-testing/roles/coordinator/commands/dispatch.md +0 -106
  51. package/.codex/skills/team-testing/roles/coordinator/commands/monitor.md +0 -156
  52. package/.codex/skills/team-testing/roles/coordinator/role.md +0 -185
@@ -2,83 +2,86 @@
2
2
  name: maestro-collab
3
3
  description: Use when a question needs cross-verification from multiple CLI tools or diverse analytical perspectives
4
4
  argument-hint: "\"<requirement>\" [--tools gemini,qwen,claude] [--mode analysis|write] [--rule <template>] [-y]"
5
- allowed-tools: spawn_agents_on_csv, Read, Write, Edit, Bash, Glob, Grep, request_user_input
5
+ allowed-tools: Read, Write, Edit, Glob, Grep, request_user_input
6
6
  ---
7
7
 
8
8
  <purpose>
9
- Wave-based multi-CLI collaboration via `spawn_agents_on_csv`. Diamond topology:
10
- Wave 1 (parallel CLI fan-out) → Wave 2 (cross-verify) → Wave 3 (synthesis).
9
+ Direct CLI fan-out collaboration via `exec_command`. Diamond topology:
10
+ Fan-out (parallel `exec_command` → `maestro delegate --to <tool>`) → Cross-verify (coordinator) → Synthesize (coordinator).
11
11
 
12
- Each CLI tool independently analyzes the requirement. Results cross-verified for consensus/conflicts, synthesized into unified report.
12
+ Each CLI tool independently analyzes the requirement via `maestro delegate` shell call.
13
+ Coordinator polls ALL CLI results to completion via delegate-protocol.codex.md,
14
+ then cross-verifies for consensus/conflicts and synthesizes into unified report.
15
+
16
+ NO spawn_agents_on_csv. NO spawn_agent. ALL CLI calls directly by coordinator via exec_command.
13
17
  </purpose>
14
18
 
19
+
15
20
  <context>
16
21
  $ARGUMENTS — requirement text and optional flags.
17
22
 
18
23
  **Flags**:
19
24
  - `--tools <list>`: Comma-separated CLI tools (default: first 3 enabled)
20
25
  - `--mode analysis|write`: Delegate mode (default: analysis)
21
- - `--rule <template>`: Shared rule template
26
+ - `--rule <template>`: Shared rule template for all delegates (see Rule Reference below)
22
27
  - `-y`: Skip confirmations
23
- - `-c N`: Max concurrency per wave (default: 5)
28
+
29
+ **Rule Reference** — common `--rule` values for collab scenarios:
30
+
31
+ | Scenario | Rule | Description |
32
+ |----------|------|-------------|
33
+ | Code quality review | `analysis-review-code-quality` | Multi-dimensional code quality review |
34
+ | Architecture review | `analysis-review-architecture` | Architecture design review |
35
+ | Bug root cause | `analysis-diagnose-bug-root-cause` | Bug root-cause diagnosis |
36
+ | Security assessment | `analysis-assess-security-risks` | Security risk assessment |
37
+ | Performance analysis | `analysis-analyze-performance` | Performance bottleneck analysis |
38
+ | Code pattern analysis | `analysis-analyze-code-patterns` | Code pattern / anti-pattern identification |
39
+ | Architecture design | `planning-plan-architecture-design` | Architecture solution design |
40
+ | Task breakdown | `planning-breakdown-task-steps` | Task breakdown planning |
41
+ | Migration strategy | `planning-plan-migration-strategy` | Migration strategy planning |
42
+ | Rigorous style | `universal-universal-rigorous-style` | Rigorous style (general-purpose) |
24
43
 
25
44
  **Auto-select** (no --tools): read `~/.maestro/cli-tools.json` → filter enabled + eligible → first 3. Exclude api-endpoint when --mode write. Minimum 2 required.
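The auto-select rule above can be sketched in Python as follows. This is only an illustration: the layout of `cli-tools.json` is not shown in this diff, so the `tools` mapping and the `enabled` / `type` keys used here are assumptions, as is the function name.

```python
from typing import Any


def auto_select_tools(config: dict[str, Any], mode: str = "analysis", limit: int = 3) -> list[str]:
    """Pick the first `limit` eligible tools in config order.

    Assumed (hypothetical) config shape:
    {"tools": {"<name>": {"enabled": bool, "type": str}}}
    """
    eligible = []
    for name, entry in config.get("tools", {}).items():
        if not entry.get("enabled", False):
            continue
        # api-endpoint tools are excluded when delegating in write mode
        if mode == "write" and entry.get("type") == "api-endpoint":
            continue
        eligible.append(name)
    if len(eligible) < 2:
        # corresponds to E002 in the error table
        raise ValueError("E002: fewer than 2 eligible tools")
    return eligible[:limit]
```

Python dictionaries preserve insertion order, so "first 3 in config order" falls out of iterating the parsed JSON directly.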
26
45
 
27
- **Session**: `.workflow/.csv-wave/{YYYYMMDD}-collab-{slug}/`
46
+ **Session**: `.workflow/.maestro/{YYYYMMDD}-collab-{slug}/`
28
47
  **Scratch**: `.workflow/scratch/{YYYYMMDD}-collab-{slug}/`
29
48
 
30
49
  **Output files**:
31
50
  - `collab-report.md` — merged findings (Consensus/Conflicts/Unique/Recommendations)
32
51
  - `context.md` — Locked/Free/Deferred decisions (plan compatible)
33
52
  - `conclusions.json` — session_id, tools[], consensus_level, recommendation, confidence, dimensions[], decisions[]
34
- - `per-tool/{tool}-output.md` — raw outputs
35
- </context>
36
-
37
- <csv_schema>
38
-
39
- ### tasks.csv
40
-
41
- ```csv
42
- id,title,description,tool,role,prompt,mode,rule,deps,context_from,wave,status,findings,recommendations,confidence,error
43
- "1","CLI: gemini","...","gemini","analyze","<prompt>","analysis","","","","1","","","","",""
44
- "2","CLI: claude","...","claude","analyze","<prompt>","analysis","","","","1","","","","",""
45
- "3","Cross-Verify","Compare CLI outputs: CONSENSUS/CONFLICT/UNIQUE","","","","","","1;2","1;2","2","","","","",""
46
- "4","Synthesis","Merge verified findings → collab-report.md + context.md + conclusions.json","","","","","","3","3","3","","","","",""
47
- ```
53
+ - `per-tool/{tool}-output.md` — raw CLI outputs
48
54
 
49
- Input columns: id, title, description, tool, role, prompt, mode, rule, deps, context_from, wave.
50
- Output columns: status (pending→completed/failed), findings, recommendations, confidence, error.
51
-
52
- ### Downstream Compatibility
55
+ **Downstream compatibility**:
53
56
 
54
57
  | Consumer | Artifact |
55
58
  |----------|----------|
56
59
  | maestro-plan | context.md + conclusions.json (via --dir) |
57
60
  | maestro-analyze | context.md as prior context (via state.json) |
58
61
  | maestro-ralph | artifact chain lookup (type=collab) |
59
-
60
- </csv_schema>
62
+ </context>
61
63
 
62
64
  <invariants>
63
- 1. **Wave order sacred**: Never execute wave N+1 before wave N completes
64
- 2. **CSV is source of truth**: Master tasks.csv holds all state
65
- 3. **Same prompt, different tool**: Wave 1 agents all use same base prompt, only --to differs
66
- 4. **Minimum 2 tools**: Abort if fewer eligible
67
- 5. **Delegate protocol**: All exec_command calls follow delegate-protocol.codex.md (yield_time + poll)
68
- 6. **Partial degradation**: If 1+ tool fails in wave 1, continue with remaining
69
- 7. **Discovery board append-only**: Never modify/delete discoveries.ndjson
65
+ 1. **ALL analysis via exec_command → maestro delegate** — coordinator NEVER performs analysis internally, NEVER spawns agents for analysis
66
+ 2. **exec_command is the execution mechanism** — every delegate call: `exec_command({ cmd: "maestro delegate ..." })`
67
+ 3. **delegate-protocol.codex.md governs lifecycle** — MUST follow exec_command → poll → write_stdin → parse for every delegate
68
+ 4. **NEVER fire-and-forget** — every exec_command MUST be polled to completion via write_stdin, result consumed before proceeding
69
+ 5. **NEVER substitute internal reasoning** — if CLI fails, report failure; do NOT generate analysis yourself as replacement
70
+ 6. **Indefinite wait** — polling has NO max timeout; continue polling until CLI returns regardless of elapsed time; NEVER abandon a running session
71
+ 6. **Same prompt, different --to** — fan-out delegates all use identical base prompt, only `--to <tool>` differs
72
+ 7. **Minimum 2 tools** — abort if fewer eligible
73
+ 8. **Partial degradation** — 1 tool fails → continue with remaining (minimum 2 results for cross-verify)
70
74
  </invariants>
71
75
 
72
76
  <state_machine>
73
77
 
74
78
  <states>
75
- S_PARSE — parse arguments, discover tools PERSIST: —
76
- S_CONFIRM — show plan, user confirms (-y skips) PERSIST: —
77
- S_CSV_GEN — generate tasks.csv PERSIST: tasks.csv
78
- S_WAVE_1 — CLI Fan-Out (parallel spawn) PERSIST: per-tool outputs + tasks.csv
79
- S_WAVE_2 — Cross-Verify (single agent spawn) PERSIST: tasks.csv
80
- S_WAVE_3 — Synthesis (single agent spawn) PERSIST: reports + tasks.csv
81
- S_AGGREGATE — register artifact, print summary PERSIST: state.json + results.csv
79
+ S_PARSE — parse arguments, discover tools PERSIST: —
80
+ S_CONFIRM — show plan, user confirms (-y skips) PERSIST: —
81
+ S_FAN_OUT — parallel exec_command fan-out + poll until done PERSIST: per-tool outputs
82
+ S_CROSS_VERIFY — cross-verify: classify consensus/conflict/unique PERSIST: cross-verify.md
83
+ S_SYNTHESIZE — generate final reports PERSIST: reports
84
+ S_AGGREGATE — register artifact, print summary PERSIST: state.json
82
85
  </states>
83
86
 
84
87
  <transitions>
@@ -88,25 +91,22 @@ S_PARSE:
88
91
  → ERROR(E002) WHEN: eligible tools < 2
89
92
 
90
93
  S_CONFIRM:
91
- → S_CSV_GEN WHEN: -y OR user confirms "执行" (Execute)
92
- → S_PARSE WHEN: user modifies tools DO: re-select, validate >= 2
94
+ → S_FAN_OUT WHEN: -y OR user confirms
95
+ → S_PARSE WHEN: user modifies tools
93
96
  → END WHEN: user cancels
94
97
 
95
- S_CSV_GEN:
96
- → S_WAVE_1 DO: A_GENERATE_CSV (N tool rows wave 1 + 1 verify wave 2 + 1 synthesis wave 3)
97
-
98
- S_WAVE_1:
99
- → S_WAVE_2 WHEN: 1+ agents completed DO: A_SPAWN_WAVE_1
100
- → ERROR(E004) WHEN: all failed
98
+ S_FAN_OUT:
99
+ → S_CROSS_VERIFY WHEN: 2+ delegates completed DO: A_FAN_OUT_DELEGATES
100
+ → ERROR(E004) WHEN: all failed OR fewer than 2 completed
101
101
 
102
- S_WAVE_2:
103
- → S_WAVE_3 DO: A_SPAWN_WAVE_2
102
+ S_CROSS_VERIFY:
103
+ → S_SYNTHESIZE DO: A_CROSS_VERIFY
104
104
 
105
- S_WAVE_3:
106
- → S_AGGREGATE DO: A_SPAWN_WAVE_3
105
+ S_SYNTHESIZE:
106
+ → S_AGGREGATE DO: A_SYNTHESIZE
107
107
 
108
108
  S_AGGREGATE:
109
- → END DO: A_AGGREGATE_RESULTS
109
+ → END DO: A_AGGREGATE_RESULTS
110
110
 
111
111
  </transitions>
112
112
 
@@ -114,111 +114,212 @@ S_AGGREGATE:
114
114
 
115
115
  ### A_PARSE_AND_DISCOVER
116
116
 
117
- 1. Parse flags: requirement, tools, mode, rule, autoYes, concurrency
118
- 2. Read cli-tools.json → build eligible tool list
117
+ 1. Parse flags: requirement, tools, mode, rule, autoYes
118
+ 2. Read `~/.maestro/cli-tools.json` → build eligible tool list
119
119
  3. Auto-select if no --tools: first 3 eligible in config order
120
- 4. Load context: project.md + `maestro spec load --category arch` + `maestro wiki list --category arch`
121
- 5. Build shared delegate prompt (6-field format: PURPOSE/TASK/MODE/CONTEXT/EXPECTED/CONSTRAINTS)
120
+ 4. Build shared delegate prompt (6-field format):
121
+ ```
122
+ PURPOSE: {requirement} + cross-verification analysis
123
+ TASK: {specific analysis tasks from requirement}
124
+ MODE: {mode}
125
+ CONTEXT: @**/* | {project context if available}
126
+ EXPECTED: Structured findings with evidence, confidence per dimension
127
+ CONSTRAINTS: {scope limits}
128
+ ```
129
+ 5. Create session + scratch dirs
130
+ 6. `update_plan` with all phases pending
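As a rough illustration of step 4 and of the "same prompt, different `--to`" invariant, the shared prompt and per-tool command lines could be assembled like this. The helper names are hypothetical; only the six field labels and the `--to`, `--mode`, and `--rule` flags come from the skill text.

```python
from typing import Optional


def build_delegate_prompt(requirement: str, tasks: str, mode: str,
                          project_context: str = "", constraints: str = "") -> str:
    """Render the 6-field prompt (PURPOSE/TASK/MODE/CONTEXT/EXPECTED/CONSTRAINTS)."""
    context = "@**/*" + (f" | {project_context}" if project_context else "")
    fields = [
        ("PURPOSE", f"{requirement} + cross-verification analysis"),
        ("TASK", tasks),
        ("MODE", mode),
        ("CONTEXT", context),
        ("EXPECTED", "Structured findings with evidence, confidence per dimension"),
        ("CONSTRAINTS", constraints),
    ]
    return "\n".join(f"{name}: {value}" for name, value in fields)


def delegate_argv(prompt: str, tool: str, mode: str, rule: Optional[str] = None) -> list[str]:
    """Build one `maestro delegate` argv; only the --to value differs per tool."""
    argv = ["maestro", "delegate", prompt, "--to", tool, "--mode", mode]
    if rule:
        argv += ["--rule", rule]
    return argv
```

Building an argv list rather than a quoted shell string sidesteps escaping problems when the prompt itself contains quotes or newlines.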
122
131
 
123
- ### A_GENERATE_CSV
132
+ ### A_FAN_OUT_DELEGATES
124
133
 
125
- Create session + scratch dirs. Write tasks.csv:
126
- - Wave 1: one row per selected tool (parallel)
127
- - Wave 2: cross-verify row (deps on all wave 1 IDs)
128
- - Wave 3: synthesis row (deps on wave 2 ID)
134
+ #### Phase 1: Parallel Launch
129
135
 
130
- ### A_SPAWN_WAVE_1
131
-
132
- Filter wave==1 from CSV → write wave-1.csv.
136
+ Launch ALL delegate commands simultaneously via `multi_tool_use.parallel`:
133
137
 
134
138
  ```
135
- spawn_agents_on_csv({ csv_path: "wave-1.csv", max_concurrency: N })
139
+ multi_tool_use.parallel({
140
+ tool_uses: [
141
+ {
142
+ recipient_name: "functions.exec_command",
143
+ parameters: {
144
+ cmd: "maestro delegate \"<shared_prompt>\" --to gemini --mode <mode> [--rule <rule>]",
145
+ yield_time_ms: 30000,
146
+ max_output_tokens: 6000
147
+ }
148
+ },
149
+ {
150
+ recipient_name: "functions.exec_command",
151
+ parameters: {
152
+ cmd: "maestro delegate \"<shared_prompt>\" --to claude --mode <mode> [--rule <rule>]",
153
+ yield_time_ms: 30000,
154
+ max_output_tokens: 6000
155
+ }
156
+ }
157
+ // ... one entry per selected tool
158
+ ]
159
+ })
136
160
  ```
137
161
 
138
- **Agent instruction**: Execute `maestro delegate "<prompt>" --to <tool> --mode <mode>` via exec_command (delegate-protocol.codex.md). Write output to per-tool/{tool}-output.md. Extract findings/recommendations/confidence. Append discoveries.ndjson.
162
+ #### Phase 2: Block Until ALL Complete
139
163
 
140
- Merge results → master tasks.csv.
164
+ For each result from Phase 1, check completion status:
141
165
 
142
- ### A_SPAWN_WAVE_2
166
+ - **Completed** (no session_id) → save output directly to `{scratchDir}/per-tool/{tool}-output.md`
167
+ - **Running** (session_id returned) → add to `pending_sessions[]`
143
168
 
144
- Filter wave==2 → write wave-2.csv. Build prev_context from wave 1 findings.
169
+ **Blocking poll loop runs until pending_sessions is empty:**
145
170
 
146
171
  ```
147
- spawn_agents_on_csv({ csv_path: "wave-2.csv", max_concurrency: 1 })
172
+ pending_sessions = [{ tool, session_id }, ...]
173
+
174
+ WHILE pending_sessions.length > 0:
175
+ FOR EACH session IN pending_sessions:
176
+ result = write_stdin({
177
+ session_id: session.session_id,
178
+ chars: "",
179
+ yield_time_ms: 60000, // 60s per poll — no rush, wait for real output
180
+ max_output_tokens: 6000
181
+ })
182
+
183
+ IF result indicates completed:
184
+ save output → {scratchDir}/per-tool/{session.tool}-output.md
185
+ REMOVE session FROM pending_sessions
186
+ completed_count += 1
187
+
188
+ IF result indicates failed/error:
189
+ log error for session.tool
190
+ REMOVE session FROM pending_sessions
191
+ failed_count += 1
192
+
193
+ // still running → stays in pending_sessions, poll again next round
148
194
  ```
149
195
 
150
- **Agent instruction**: Read all per-tool outputs + discoveries.ndjson. Classify each finding:
196
+ **Blocking guarantees:**
197
+ - `yield_time_ms: 60000` — each poll waits up to 60s for output, no short-circuit
198
+ - NO max retry count — loop continues indefinitely until CLI returns
199
+ - NO timeout escalation — delegate can run as long as needed (30s to 10min+)
200
+ - NO early exit — even if tool 1 and 2 are done, keep polling tool 3 until it completes
201
+ - Round-robin ensures fair polling across all pending sessions
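The blocking round-robin loop can be modelled in plain Python as below. This is a sketch under assumptions: `poll` is a hypothetical stand-in for the `write_stdin` call and is assumed to return a `(state, output)` pair with `state` being `"running"`, `"completed"`, or `"failed"`. Unlike the pseudocode, it rebuilds the pending list each round instead of removing entries while iterating, which avoids skipped elements in languages where in-loop removal is unsafe.

```python
from typing import Callable


def poll_until_done(pending: list[dict],
                    poll: Callable[[str], tuple[str, str]]) -> tuple[dict, dict]:
    """Poll every session round-robin until none is still running.

    pending: [{"tool": ..., "session_id": ...}, ...]
    Returns (outputs_by_tool, errors_by_tool). There is deliberately
    no retry cap and no timeout, matching the blocking guarantees.
    """
    outputs: dict[str, str] = {}
    errors: dict[str, str] = {}
    while pending:
        still_running = []
        for session in pending:
            state, output = poll(session["session_id"])
            if state == "completed":
                outputs[session["tool"]] = output  # raw output, saved verbatim
            elif state == "failed":
                errors[session["tool"]] = output
            else:
                still_running.append(session)  # poll again next round
        pending = still_running
    return outputs, errors
```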
202
+
203
+ #### Phase 3: Validate
204
+
205
+ - Count completed tools
206
+ - completed < 2 → ERROR(E004)
207
+ - 1 tool failed but 2+ succeeded → W001, log failure, continue
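A minimal sketch of the Phase 3 check, assuming the fan-out results have already been split into completed and failed tool names (the function name and message wording are illustrative, not taken from the package):

```python
def validate_fan_out(completed: list[str], failed: list[str]) -> list[str]:
    """Return W001 warnings, or raise if fewer than 2 tools completed (E004)."""
    if len(completed) < 2:
        raise RuntimeError(
            f"E004: only {len(completed)} delegate(s) completed; failed: {failed}"
        )
    # Partial degradation: log each failure and continue with the rest
    return [f"W001: {tool} failed, continuing with remaining tools" for tool in failed]
```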
208
+
209
+ **Iron rules**:
210
+ - NEVER skip polling — every session_id MUST be polled to completion
211
+ - NEVER proceed to S_CROSS_VERIFY while pending_sessions is non-empty
212
+ - NEVER set a max timeout or max retry count on the poll loop
213
+ - NEVER generate analysis internally as substitute for CLI output
214
+ - NEVER summarize or paraphrase — save raw CLI output verbatim
215
+
216
+ ### A_CROSS_VERIFY
217
+
218
+ Coordinator reads ALL per-tool outputs from `{scratchDir}/per-tool/` and classifies each finding:
151
219
 
152
220
  | Condition | Tag |
153
221
  |-----------|-----|
154
- | 2+ tools agree | CONSENSUS |
155
- | Tools disagree | CONFLICT |
156
- | 1 tool only | UNIQUE |
222
+ | 2+ tools agree on same finding | CONSENSUS |
223
+ | Tools have contradictory findings | CONFLICT |
224
+ | Only 1 tool identified | UNIQUE |
157
225
 
158
- Compute consensus_level = consensus_count / total * 100.
226
+ For each CONFLICT: note which tools disagree, their evidence, and confidence levels.
159
227
 
160
- Merge results → master tasks.csv.
228
+ Compute: `consensus_level = consensus_count / total_findings * 100`
161
229
 
162
- ### A_SPAWN_WAVE_3
230
+ Write results to `{scratchDir}/cross-verify.md`.
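To make the tagging and the `consensus_level` formula concrete, here is a small Python sketch. It assumes each per-tool finding has already been normalized to a `(tool, finding_key, stance)` triple, and that `total_findings` counts distinct finding keys; both are assumptions, since the skill text does not define how findings are matched across tools.

```python
from collections import defaultdict


def cross_verify(findings: list[tuple[str, str, str]]) -> tuple[dict[str, str], float]:
    """Tag each finding key and compute consensus_level as a percentage."""
    stances_by_key: dict[str, dict[str, str]] = defaultdict(dict)
    for tool, key, stance in findings:
        stances_by_key[key][tool] = stance

    tags: dict[str, str] = {}
    for key, stances in stances_by_key.items():
        if len(stances) == 1:
            tags[key] = "UNIQUE"        # only 1 tool identified it
        elif len(set(stances.values())) == 1:
            tags[key] = "CONSENSUS"     # 2+ tools agree
        else:
            tags[key] = "CONFLICT"      # tools contradict each other

    total = len(tags)
    consensus_count = sum(1 for tag in tags.values() if tag == "CONSENSUS")
    consensus_level = consensus_count / total * 100 if total else 0.0
    return tags, consensus_level
```

With two consensus findings out of four distinct findings, this yields a `consensus_level` of 50.0, above the 40% threshold that triggers W004.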
163
231
 
164
- Filter wave==3 → write wave-3.csv. Build prev_context from wave 2 findings.
232
+ ### A_SYNTHESIZE
165
233
 
166
- ```
167
- spawn_agents_on_csv({ csv_path: "wave-3.csv", max_concurrency: 1 })
168
- ```
234
+ Generate 3 output files from cross-verify results:
169
235
 
170
- **Agent instruction**: Resolve conflicts via evidence-weighted voting (higher confidence wins, specific evidence > general). Generate 3 files:
171
- 1. **collab-report.md**: Summary, Consensus Findings, Resolved Conflicts, Unresolved Items, Unique Insights, Recommendations, Per-Tool Confidence table
172
- 2. **context.md**: Locked (CONSENSUS), Free (UNIQUE w/ strong evidence), Deferred (UNRESOLVED). Standard Locked/Free/Deferred format.
173
- 3. **conclusions.json**: session_id, subject, mode, tools[], consensus_level, recommendation (Go/No-Go/Conditional), confidence, dimensions[], decisions[]
236
+ 1. **collab-report.md**:
237
+ ```markdown
238
+ # Collaborative Analysis: {requirement}
174
239
 
175
- Merge results → master tasks.csv.
240
+ ## Summary
241
+ Tools: {tool list} | Consensus: {consensus_level}%
176
242
 
177
- ### A_AGGREGATE_RESULTS
243
+ ## Consensus Findings
244
+ {findings agreed by 2+ tools, with evidence}
178
245
 
179
- 1. Export tasks.csv → results.csv
180
- 2. Verify outputs exist (fallback: build minimal from available findings)
181
- 3. Copy collab-report.md + context.md + conclusions.json → scratchDir
182
- 4. Register CLB artifact in state.json (type: collab, scope: adhoc)
183
- 5. Spec enrichment: for each Locked decision → `maestro spec add arch`
184
- 6. Display summary (requirement, tools, consensus_level, per-tool status, artifact ID, next steps)
246
+ ## Conflicts
247
+ {contradictory findings with per-tool positions and evidence}
185
248
 
186
- </actions>
249
+ ## Unique Insights
250
+ {single-tool findings worth noting}
187
251
 
188
- </state_machine>
252
+ ## Recommendations
253
+ {actionable recommendations, prioritized}
189
254
 
190
- <discovery_board>
255
+ ## Per-Tool Confidence
256
+ | Tool | Confidence | Key Contribution |
257
+ |------|-----------|-----------------|
258
+ ```
191
259
 
192
- | Type | Dedup Key | Data |
193
- |------|-----------|------|
194
- | cli_finding | tool+dimension | {tool, dimension, finding, confidence, evidence} |
195
- | consensus | area | {area, tools[], finding, confidence} |
196
- | conflict | area | {area, positions[{tool, stance, evidence}], resolution} |
197
- | unique_insight | tool+finding | {tool, finding, significance, actionable} |
260
+ 2. **context.md**: Locked (CONSENSUS) / Free (UNIQUE w/ strong evidence) / Deferred (CONFLICT unresolved)
198
261
 
199
- Protocol: read before analysis, append-only, dedup by type+key.
200
- </discovery_board>
262
+ 3. **conclusions.json**:
263
+ ```json
264
+ {
265
+ "session_id": "", "subject": "", "mode": "",
266
+ "tools": [], "consensus_level": 0,
267
+ "recommendation": "Go|No-Go|Conditional",
268
+ "confidence": 0,
269
+ "dimensions": [{ "name": "", "consensus": "", "details": "" }],
270
+ "decisions": [{ "area": "", "status": "locked|free|deferred", "rationale": "" }]
271
+ }
272
+ ```
273
+
274
+ ### A_AGGREGATE_RESULTS
275
+
276
+ 1. Copy outputs to scratchDir
277
+ 2. Register CLB artifact in state.json (type: collab, scope: adhoc)
278
+ 3. Spec enrichment: for each Locked decision → `maestro spec add arch`
279
+ 4. `update_plan` all steps completed
280
+ 5. Display summary:
281
+ ```
282
+ == Collab Analysis Complete ==
283
+ Requirement: {requirement}
284
+ Tools: {tool list with status}
285
+ Consensus Level: {consensus_level}%
286
+
287
+ Key Findings:
288
+ CONSENSUS: {count}
289
+ CONFLICT: {count}
290
+ UNIQUE: {count}
291
+
292
+ Reports: {scratchDir}/collab-report.md
293
+ Next: $maestro-plan --dir {scratchDir}
294
+ ```
295
+
296
+ </actions>
297
+
298
+ </state_machine>
201
299
 
202
300
  <error_codes>
203
301
  | Code | Condition | Recovery |
204
302
  |------|-----------|----------|
205
- | E002 | Fewer than 2 eligible tools | Check cli-tools.json |
206
- | E004 | All wave 1 delegates failed | Abort with per-tool error details |
207
- | W001 | One tool failed wave 1 | Continue with remaining |
208
- | W003 | Synthesis failed | Use cross-verify output as fallback |
209
- | W004 | consensus_level < 40% | Flag in summary |
303
+ | E002 | Fewer than 2 eligible tools | Check cli-tools.json, specify --tools |
304
+ | E004 | All delegates failed or < 2 completed | Show per-tool errors, abort |
305
+ | W001 | One tool failed | Continue with remaining |
306
+ | W004 | consensus_level < 40% | Flag in summary as low-confidence |
210
307
  </error_codes>
211
308
 
212
309
  <success_criteria>
213
- - [ ] Wave 1: all delegates via delegate-protocol.codex.md, per-tool outputs written
214
- - [ ] Wave 2: consensus/conflict/unique classified, consensus_level computed
215
- - [ ] Wave 3: collab-report.md + context.md + conclusions.json produced
216
- - [ ] CLB artifact registered, outputs copied to scratchDir
217
- - [ ] Partial degradation: continued if 1+ tools succeeded
310
+ - [ ] ALL analysis performed via exec_command → maestro delegate — zero internal analysis
311
+ - [ ] multi_tool_use.parallel used for fan-out launch
312
+ - [ ] Every exec_command polled to completion via write_stdin — no timeout cap, no max retries
313
+ - [ ] Blocking poll loop ran until pending_sessions empty — no early exit
314
+ - [ ] Per-tool raw outputs saved to {scratchDir}/per-tool/
315
+ - [ ] Cross-verify: CONSENSUS/CONFLICT/UNIQUE classified, consensus_level computed
316
+ - [ ] collab-report.md + context.md + conclusions.json produced
317
+ - [ ] CLB artifact registered in state.json
318
+ - [ ] Partial degradation: continued if 2+ tools succeeded
218
319
  </success_criteria>
219
320
 
220
321
  <next_step_routing>
221
322
  - Deep feasibility analysis → `$maestro-analyze "{topic}"`
222
- - Plan from conclusions → `$maestro-plan --dir {dir}`
323
+ - Plan from conclusions → `$maestro-plan --dir {scratchDir}`
223
324
  - Expand exploration → `$maestro-brainstorm "{topic}"`
224
325
  </next_step_routing>
@@ -94,12 +94,12 @@ When `--yes` or `-y`: Auto-confirm task breakdown, skip blocked-task prompts, au
94
94
  ### tasks.csv (Master State)
95
95
 
96
96
  ```csv
97
- id,title,description,scope,convergence_criteria,hints,execution_directives,deps,context_from,wave,status,findings,files_modified,tests_passed,error
98
- "TASK-001","Setup auth module","Create authentication module with JWT token generation and verification. Export verifyToken and generateToken functions.","src/auth/","auth.ts contains export function verifyToken(; auth.ts contains export function generateToken(","Reference existing middleware pattern in src/middleware/auth.ts","npm test -- --grep auth","","","1","","","","",""
99
- "TASK-002","Create user model","Define User interface and database schema with email, passwordHash, role fields. Use existing Result type pattern.","src/models/","user.ts contains export interface User; user.ts contains email: string","See src/models/session.ts for existing model pattern","npm test -- --grep user","","","1","","","","",""
100
- "TASK-003","Auth middleware","Create Express middleware that validates JWT from Authorization header. Use verifyToken from auth module. Return 401 on invalid token.","src/middleware/","auth-middleware.ts contains export function authMiddleware(; auth-middleware.ts contains verifyToken","Follows existing middleware pattern in src/middleware/logging.ts","npm test -- --grep middleware","TASK-001","TASK-001","2","","","","",""
101
- "TASK-004","Login endpoint","Implement POST /api/login endpoint. Validate credentials against user model, return JWT on success. Use generateToken from auth module.","src/routes/","login.ts contains router.post('/api/login'; login.ts contains generateToken(","Wire into existing Express app in src/app.ts","curl -X POST localhost:3000/api/login","TASK-001;TASK-002","TASK-001;TASK-002","2","","","","",""
102
- "TASK-005","Integration tests","Write integration tests for full auth flow: register, login, access protected route, token refresh.","tests/","tests/auth.test.ts exists; npm test exits with code 0","Use existing test setup in tests/setup.ts","npm test","TASK-003;TASK-004","TASK-003;TASK-004","3","","","","",""
97
+ id,title,description,scope,convergence_criteria,hints,execution_directives,deps,context_from,wave
98
+ "TASK-001","Setup auth module","Create authentication module with JWT token generation and verification. Export verifyToken and generateToken functions.","src/auth/","auth.ts contains export function verifyToken(; auth.ts contains export function generateToken(","Reference existing middleware pattern in src/middleware/auth.ts","npm test -- --grep auth","","","1"
99
+ "TASK-002","Create user model","Define User interface and database schema with email, passwordHash, role fields. Use existing Result type pattern.","src/models/","user.ts contains export interface User; user.ts contains email: string","See src/models/session.ts for existing model pattern","npm test -- --grep user","","","1"
100
+ "TASK-003","Auth middleware","Create Express middleware that validates JWT from Authorization header. Use verifyToken from auth module. Return 401 on invalid token.","src/middleware/","auth-middleware.ts contains export function authMiddleware(; auth-middleware.ts contains verifyToken","Follows existing middleware pattern in src/middleware/logging.ts","npm test -- --grep middleware","TASK-001","TASK-001","2"
101
+ "TASK-004","Login endpoint","Implement POST /api/login endpoint. Validate credentials against user model, return JWT on success. Use generateToken from auth module.","src/routes/","login.ts contains router.post('/api/login'; login.ts contains generateToken(","Wire into existing Express app in src/app.ts","curl -X POST localhost:3000/api/login","TASK-001;TASK-002","TASK-001;TASK-002","2"
102
+ "TASK-005","Integration tests","Write integration tests for full auth flow: register, login, access protected route, token refresh.","tests/","tests/auth.test.ts exists; npm test exits with code 0","Use existing test setup in tests/setup.ts","npm test","TASK-003;TASK-004","TASK-003;TASK-004","3"
103
103
  ```
104
104
 
105
105
  **Columns**:
@@ -116,12 +116,14 @@ id,title,description,scope,convergence_criteria,hints,execution_directives,deps,
116
116
  | `deps` | Input | Semicolon-separated dependency task IDs |
117
117
  | `context_from` | Input | Semicolon-separated task IDs whose findings this task needs |
118
118
  | `wave` | Computed | Wave number from plan.json wave assignment |
119
- | `status` | Output | `pending` -> `completed` / `failed` / `blocked` / `skipped` |
119
+ | `status` | Output | `pending` -> `completed` / `failed` / `blocked` / `skipped` (mapped from output_schema `result_status`) |
120
120
  | `findings` | Output | Implementation notes and observations (max 500 chars) |
121
121
  | `files_modified` | Output | Semicolon-separated list of created/modified files |
122
122
  | `tests_passed` | Output | Test pass/fail status from execution_directives |
123
123
  | `error` | Output | Error message if failed or blocked |
124
124
 
125
+ **Column separation rule**: Wave CSV (input to spawn_agents_on_csv) and output_schema MUST NOT share column names. Wave CSV only contains Input columns + prev_context. Output columns are returned exclusively via output_schema (using `result_status`, not `status`). During merge, `result_status` maps back to the master CSV's `status` column.
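The merge described by the column separation rule might look like the following Python sketch. Rows are modelled as plain dicts (as `csv.DictReader` would produce); the function name is hypothetical, while the column names are the ones listed in the table above.

```python
OUTPUT_COLUMNS = ["findings", "files_modified", "tests_passed", "error"]


def merge_wave_results(master_rows: list[dict], result_rows: list[dict]) -> list[dict]:
    """Copy agent output into the master rows, mapping result_status -> status."""
    results_by_id = {row["id"]: row for row in result_rows}
    for row in master_rows:
        result = results_by_id.get(row["id"])
        if result is None:
            continue  # task was not part of this wave
        # The agent reports `result_status`; the master CSV keeps `status`
        row["status"] = result["result_status"]
        for column in OUTPUT_COLUMNS:
            row[column] = result.get(column, "")
    return master_rows
```

Keeping `result_status` out of the master rows means the master CSV never gains a second status-like column, which is the point of the separation rule.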
126
+
125
127
  ### Per-Wave CSV (Temporary)
126
128
 
127
129
  Each wave generates `wave-{N}.csv` with extra `prev_context` column populated from predecessor task findings.
@@ -132,7 +134,7 @@ Each wave generates `wave-{N}.csv` with extra `prev_context` column populated fr
132
134
  |------|---------|-----------|
133
135
  | `tasks.csv` | Master state -- all tasks with status/findings | Updated after each wave |
134
136
  | `wave-{N}.csv` | Per-wave input (temporary) | Created before wave, deleted after |
135
- | `wave-{N}-results.csv` | Per-wave output | Created by spawn_agents_on_csv |
137
+ | `wave-{N}-results.csv` | Per-wave output (uses `result_status`) | Created by spawn_agents_on_csv, deleted after merge |
136
138
  | `results.csv` | Final export of all task results | Created in Phase 3 |
137
139
  | `discoveries.ndjson` | Shared exploration board | Append-only, carries across waves |
138
140
  | `context.md` | Human-readable execution report | Created in Phase 3 |
@@ -158,7 +160,7 @@ Each wave generates `wave-{N}.csv` with extra `prev_context` column populated fr
158
160
  4. **Context Propagation**: prev_context built from master CSV findings, not from memory
159
161
  5. **Discovery Board is Append-Only**: Never clear, modify, or recreate discoveries.ndjson
160
162
  6. **Cascading Skip on Failure**: If a task fails/blocks, all dependent tasks are skipped
161
- 7. **Cleanup Temp Files**: Remove wave-{N}.csv after results are merged
163
+ 7. **Cleanup Temp Files**: Remove `wave-{N}.csv` AND `wave-{N}-results.csv` after results are merged
162
164
  8. **Max 3 Fix Attempts**: Per task, auto-fix convergence failures up to 3 times, then mark blocked
163
165
  9. **Breakpoint Resume**: Always detect completed tasks and skip them on re-run
164
166
  10. **DO NOT STOP**: Continuous execution until all waves complete or user explicitly stops
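Rule 6 (cascading skip) can be sketched as a fixed-point pass over the task list. This is a hedged illustration only: the `skipped` status literal, the array form of `deps`, and the helper name `cascadeSkips` are assumptions, not taken from the skill.

```javascript
// tasks: [{ id, deps: [taskId, ...], status }]
// Marks every pending task as "skipped" when any dependency is failed,
// blocked, or itself skipped. Repeats until no further task changes,
// so skips propagate through chains of dependencies.
function cascadeSkips(tasks) {
  const byId = new Map(tasks.map((task) => [task.id, task]));
  const dead = new Set(["failed", "blocked", "skipped"]);
  let changed = true;
  while (changed) {
    changed = false;
    for (const task of tasks) {
      if (task.status !== "pending") continue;
      const hasDeadDep = task.deps.some((depId) => dead.has(byId.get(depId)?.status));
      if (hasDeadDep) {
        task.status = "skipped";
        changed = true;
      }
    }
  }
  return tasks;
}
```

Tasks with no failed ancestor keep their `pending` status and remain eligible for later waves.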
@@ -251,11 +253,11 @@ spawn_agents_on_csv({
251
253
  instruction: buildExecutorInstruction(sessionFolder, phaseDir, autoCommit, specsContent), // agent: ~/.codex/agents/workflow-executor.toml
252
254
  max_concurrency: maxConcurrency, max_runtime_seconds: 3600,
253
255
  output_csv_path: `${sessionFolder}/wave-${N}-results.csv`,
254
- output_schema: { id, status: [completed|failed|blocked], findings, files_modified, tests_passed, error }
256
+ output_schema: { id, result_status: [completed|failed|blocked], findings, files_modified, tests_passed, error }
255
257
  })
256
258
  ```
257
259
 
258
- 4. Merge results into master `tasks.csv`, delete `wave-{N}.csv`
260
+ 4. Merge results into master `tasks.csv`: map `result_status` from `wave-{N}-results.csv` to the `status` column in master CSV. Delete `wave-{N}.csv` AND `wave-{N}-results.csv` after merge.
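The merge in step 4 might look like the sketch below. It assumes rows have already been parsed into plain objects keyed by column name; `mergeWaveResults` and `tempFilesForWave` are hypothetical helper names, and CSV parsing and file deletion are left out.

```javascript
// Merge wave-{N}-results.csv rows into master tasks.csv rows.
// result_status is written to the master status column; all other
// output columns (findings, error, ...) are copied under their own names.
function mergeWaveResults(masterRows, resultRows) {
  const resultsById = new Map(resultRows.map((row) => [row.id, row]));
  return masterRows.map((row) => {
    const result = resultsById.get(row.id);
    if (!result) return row; // task was not part of this wave
    const { id, result_status, ...outputs } = result;
    return { ...row, ...outputs, status: result_status };
  });
}

// Both temporary files that should be deleted once the merge has succeeded.
function tempFilesForWave(sessionFolder, waveNumber) {
  return [
    `${sessionFolder}/wave-${waveNumber}.csv`,
    `${sessionFolder}/wave-${waveNumber}-results.csv`,
  ];
}
```

Because the merged row never carries a `result_status` key, the master CSV keeps a single `status` column regardless of how many waves have run.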
259
261
 
260
262
  #### Blocked Task Handling
261
263
 
@@ -28,9 +28,9 @@ $maestro-milestone-audit "M1"
28
28
  ### tasks.csv (Master State)
29
29
 
30
30
  ```csv
31
- id,title,description,scope,check_targets,deps,wave,status,findings,gaps_found,severity,error
32
- "integ-1","Interface & dependency chains","Verify shared interfaces are consistent across phases: re-exports match, dependency chains unbroken, no circular imports between phase outputs","cross-phase imports, shared types, re-exports","grep for shared type names across phase output dirs; verify export/import consistency","","1","","","","",""
33
- "integ-2","Data contracts & API consistency","Verify request/response schemas match across phases: API signatures consistent, error codes aligned, no contract drift","request/response schemas, API signatures, error codes","diff API type definitions across phases; check error code enum consistency","","1","","","","",""
31
+ id,title,description,scope,check_targets,deps,wave
32
+ "integ-1","Interface & dependency chains","Verify shared interfaces are consistent across phases: re-exports match, dependency chains unbroken, no circular imports between phase outputs","cross-phase imports, shared types, re-exports","grep for shared type names across phase output dirs; verify export/import consistency","","1"
33
+ "integ-2","Data contracts & API consistency","Verify request/response schemas match across phases: API signatures consistent, error codes aligned, no contract drift","request/response schemas, API signatures, error codes","diff API type definitions across phases; check error code enum consistency","","1"
34
34
  ```
35
35
 
36
36
  **Columns**:
@@ -44,19 +44,21 @@ id,title,description,scope,check_targets,deps,wave,status,findings,gaps_found,se
44
44
  | `check_targets` | Input | Specific verification commands/grep patterns |
45
45
  | `deps` | Input | Dependencies (empty — all wave 1) |
46
46
  | `wave` | Computed | Wave number (always 1 — single parallel wave) |
47
- | `status` | Output | `pending` -> `pass` / `fail` / `warning` |
47
+ | `result_status` | Output | `pass` / `fail` / `warning` |
48
48
  | `findings` | Output | Detailed findings per dimension (max 500 chars) |
49
49
  | `gaps_found` | Output | Semicolon-separated list of integration gaps |
50
50
  | `severity` | Output | `critical` / `warning` / `info` per gap |
51
51
  | `error` | Output | Error message if check failed |
52
52
 
53
+ **Column separation rule**: Input columns and Output columns MUST NOT share names. Wave CSV only contains Input columns. Output columns are returned exclusively via output_schema.
54
+
53
55
  ### Session Structure
54
56
 
55
57
  ```
56
58
  .workflow/.csv-wave/{YYYYMMDD}-audit-{milestone}/
57
59
  +-- tasks.csv
58
- +-- wave-1.csv (temporary)
59
- +-- wave-1-results.csv
60
+ +-- wave-1.csv (temporary, deleted after merge)
61
+ +-- wave-1-results.csv (temporary, deleted after merge)
60
62
  ```
61
63
  </csv_schema>
62
64
 
@@ -95,16 +97,16 @@ Verify all adhoc-scoped artifacts completed. For each execute artifact, verify a
95
97
  spawn_agents_on_csv({
96
98
  csv_path: `${sessionFolder}/wave-1.csv`,
97
99
  id_column: "id",
98
- instruction: `You are an integration checker for milestone ${milestone}. For each row, examine the scope and check_targets. Search the codebase for inconsistencies, contract drift, and broken dependencies across phase outputs. Report findings with file:line references. Set status to pass/fail/warning. List specific gaps in gaps_found (semicolon-separated).`,
100
+ instruction: `You are an integration checker for milestone ${milestone}. For each row, examine the scope and check_targets. Search the codebase for inconsistencies, contract drift, and broken dependencies across phase outputs. Report findings with file:line references. Set result_status to pass/fail/warning. List specific gaps in gaps_found (semicolon-separated).`,
99
101
  max_concurrency: 2, max_runtime_seconds: 600,
100
102
  output_csv_path: `${sessionFolder}/wave-1-results.csv`,
101
- output_schema: { id, status: [pass|fail|warning], findings, gaps_found, severity, error }
103
+ output_schema: { id, result_status: [pass|fail|warning], findings, gaps_found, severity, error }
102
104
  })
103
105
  ```
104
106
 
105
- 4. Merge results into master `tasks.csv`
107
+ 4. Merge results into master `tasks.csv`: map `result_status` → master `status` column, copy `findings`, `gaps_found`, `severity`, `error`. Delete temporary files (`wave-1.csv`, `wave-1-results.csv`) after merge.
106
108
  5. Parse `gaps_found` from all workers — aggregate into `.workflow/milestones/{milestone}/audit-report.md`
107
- 6. Any worker with `status == fail` and `severity == critical` → milestone verdict = FAIL
109
+ 6. Any worker with `result_status == fail` and `severity == critical` → milestone verdict = FAIL
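Steps 5 and 6 can be sketched as two small functions over the parsed result rows. This is an illustration under assumptions: `severity` is treated as a single value per row, the helper names are hypothetical, and only the FAIL condition stated above is modeled (other verdicts are not inferred here).

```javascript
// Collect every gap reported by the workers. gaps_found is a
// semicolon-separated list, so split, trim, and drop empty entries.
function collectGaps(resultRows) {
  return resultRows.flatMap((row) =>
    (row.gaps_found || "")
      .split(";")
      .map((gap) => gap.trim())
      .filter(Boolean)
  );
}

// True when at least one worker reported result_status == fail
// together with severity == critical.
function isMilestoneFail(resultRows) {
  return resultRows.some(
    (row) => row.result_status === "fail" && row.severity === "critical"
  );
}
```

A `fail` with a lower severity, or a `warning` with critical severity, does not trigger the FAIL verdict under this reading of the rule.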
108
110
 
109
111
  ### Step 6: Verdict
110
112
 
@@ -99,8 +99,8 @@ S_BUILD_CHAIN:
99
99
  -> S_CREATE_SESSION DO: A_BUILD_STEPS
100
100
 
101
101
  S_CREATE_SESSION:
102
- -> S_CONFIRM WHEN: not auto_mode
103
- -> S_LOAD_NEXT WHEN: auto_mode
102
+ -> S_CONFIRM WHEN: not auto_mode DO: A_CREATE_SESSION
103
+ -> S_LOAD_NEXT WHEN: auto_mode DO: A_CREATE_SESSION
104
104
 
105
105
  S_CONFIRM:
106
106
  -> S_LOAD_NEXT WHEN: "Proceed"
@@ -220,6 +220,13 @@ Priority: regex from intent `phase\s*(\d+)` -> latest in-progress artifact's pha
220
220
  6. Args use placeholders `{phase}`, `{intent}`, `{dirs}` — resolved at wave execution time
221
221
  7. Append `-y` to all skill args when `auto_mode` is true (see -y propagation table in context)
222
222
 
223
+ ### A_CREATE_SESSION
224
+
225
+ 1. Write `.workflow/.maestro/ralph-{YYYYMMDD-HHmmss}/status.json` (see Session JSON Schema)
226
+ 2. Initialize tracking:
227
+ - `create_goal({ objective: "Ralph lifecycle: {quality_mode} mode, {N} steps from {lifecycle_position}" })`
228
+ - `update_plan({ plan: steps.map(step => ({ step, status: "pending" })) })`
229
+
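The session path and the initial plan from A_CREATE_SESSION can be sketched as below. The helper names are hypothetical, and the use of local time for `{YYYYMMDD-HHmmss}` is an assumption, since the skill does not state a time zone. The `create_goal` and `update_plan` calls themselves are not reproduced here.

```javascript
// Format a Date as YYYYMMDD-HHmmss using local time (assumption).
function sessionTimestamp(date) {
  const pad = (n) => String(n).padStart(2, "0");
  const day = `${date.getFullYear()}${pad(date.getMonth() + 1)}${pad(date.getDate())}`;
  const time = `${pad(date.getHours())}${pad(date.getMinutes())}${pad(date.getSeconds())}`;
  return `${day}-${time}`;
}

// Path of the status.json written in step 1.
function sessionStatusPath(date) {
  return `.workflow/.maestro/ralph-${sessionTimestamp(date)}/status.json`;
}

// Plan entries for step 2: every step starts as "pending".
// The parentheses around the object literal matter: without them the
// arrow function body is parsed as a block and returns undefined.
function initialPlan(steps) {
  return steps.map((step) => ({ step, status: "pending" }));
}
```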
223
230
  ### A_BUILD_AND_SPAWN_WAVE
224
231
 
225
232
  1. Conditional step eval: check_coverage -> read validation.json, skip if >= threshold
@@ -255,11 +262,16 @@ Update session: milestone, phase, reset passed_gates. Re-infer quality_mode. Bui
255
262
 
256
263
  ### A_FINALIZE
257
264
 
258
- Set status = "completed". Sync update_plan. Release goal. Display completion report.
265
+ 1. Set `session.status = "completed"`, write status.json
266
+ 2. Sync update_plan: all steps → "completed"
267
+ 3. `update_goal({ status: "complete" })` — release goal constraint
268
+ 4. Display completion report
259
269
 
260
270
  ### A_PAUSE_SESSION
261
271
 
262
- Set status = "paused". Do NOT release goal. Display: use $maestro-ralph execute to continue.
272
+ 1. Set `session.status = "paused"`, write status.json
273
+ 2. Do NOT call `update_goal` — goal stays for `execute`/`continue` resume
274
+ 3. Display: use `$maestro-ralph execute` to continue
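The difference between A_FINALIZE and A_PAUSE_SESSION is whether the goal is released. A minimal sketch, assuming the tool calls are injected as plain functions (`writeStatus`, `updatePlan`, and `updateGoal` are hypothetical stand-ins for writing status.json, `update_plan`, and `update_goal`):

```javascript
// A_FINALIZE: mark the session completed, sync the plan, release the goal.
function finalizeSession(session, tools) {
  session.status = "completed";
  tools.writeStatus(session);
  tools.updatePlan(session.steps.map((s) => ({ step: s.step, status: "completed" })));
  tools.updateGoal({ status: "complete" });
  return session;
}

// A_PAUSE_SESSION: persist the paused state but leave the goal untouched,
// so a later execute/continue can resume under the same goal.
function pauseSession(session, tools) {
  session.status = "paused";
  tools.writeStatus(session);
  // Deliberately no tools.updateGoal call here.
  return session;
}
```

Keeping the goal call out of the pause path makes the "Do NOT call `update_goal`" rule a property of the code rather than of a status check.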
263
275
 
264
276
  </actions>
265
277