maestro-flow 0.4.5 → 0.4.7
- package/.codex/skills/maestro-collab/SKILL.md +218 -117
- package/.codex/skills/maestro-execute/SKILL.md +13 -11
- package/.codex/skills/maestro-milestone-audit/SKILL.md +12 -10
- package/.codex/skills/maestro-ralph/SKILL.md +16 -4
- package/.codex/skills/maestro-ui-codify/SKILL.md +18 -16
- package/.codex/skills/manage-codebase-rebuild/SKILL.md +20 -13
- package/.codex/skills/manage-issue-discover/SKILL.md +19 -17
- package/.codex/skills/quality-debug/SKILL.md +35 -31
- package/.codex/skills/quality-refactor/SKILL.md +20 -12
- package/.codex/skills/quality-review/SKILL.md +21 -17
- package/.codex/skills/team-coordinate/SKILL.md +462 -235
- package/.codex/skills/team-coordinate/specs/role-catalog.md +132 -0
- package/.codex/skills/team-lifecycle-v4/SKILL.md +445 -191
- package/.codex/skills/team-quality-assurance/SKILL.md +205 -161
- package/.codex/skills/team-review/SKILL.md +198 -159
- package/.codex/skills/team-tech-debt/SKILL.md +214 -144
- package/.codex/skills/team-testing/SKILL.md +210 -158
- package/package.json +1 -1
- package/.codex/skills/team-coordinate/roles/coordinator/commands/analyze-task.md +0 -247
- package/.codex/skills/team-coordinate/roles/coordinator/commands/dispatch.md +0 -126
- package/.codex/skills/team-coordinate/roles/coordinator/commands/monitor.md +0 -265
- package/.codex/skills/team-coordinate/roles/coordinator/role.md +0 -403
- package/.codex/skills/team-coordinate/specs/knowledge-transfer.md +0 -113
- package/.codex/skills/team-coordinate/specs/pipelines.md +0 -97
- package/.codex/skills/team-coordinate/specs/quality-gates.md +0 -112
- package/.codex/skills/team-coordinate/specs/role-spec-template.md +0 -192
- package/.codex/skills/team-executor/SKILL.md +0 -116
- package/.codex/skills/team-executor/roles/executor/commands/monitor.md +0 -213
- package/.codex/skills/team-executor/roles/executor/role.md +0 -173
- package/.codex/skills/team-executor/specs/session-schema.md +0 -230
- package/.codex/skills/team-lifecycle-v4/roles/coordinator/commands/analyze.md +0 -56
- package/.codex/skills/team-lifecycle-v4/roles/coordinator/commands/dispatch.md +0 -61
- package/.codex/skills/team-lifecycle-v4/roles/coordinator/commands/monitor.md +0 -113
- package/.codex/skills/team-lifecycle-v4/roles/coordinator/role.md +0 -189
- package/.codex/skills/team-lifecycle-v4/schemas/tasks-schema.md +0 -100
- package/.codex/skills/team-lifecycle-v4/specs/knowledge-transfer.md +0 -204
- package/.codex/skills/team-quality-assurance/roles/coordinator/commands/analyze.md +0 -72
- package/.codex/skills/team-quality-assurance/roles/coordinator/commands/dispatch.md +0 -108
- package/.codex/skills/team-quality-assurance/roles/coordinator/commands/monitor.md +0 -163
- package/.codex/skills/team-quality-assurance/roles/coordinator/role.md +0 -177
- package/.codex/skills/team-review/roles/coordinator/commands/analyze.md +0 -71
- package/.codex/skills/team-review/roles/coordinator/commands/dispatch.md +0 -90
- package/.codex/skills/team-review/roles/coordinator/commands/monitor.md +0 -135
- package/.codex/skills/team-review/roles/coordinator/role.md +0 -176
- package/.codex/skills/team-tech-debt/roles/coordinator/commands/analyze.md +0 -47
- package/.codex/skills/team-tech-debt/roles/coordinator/commands/dispatch.md +0 -163
- package/.codex/skills/team-tech-debt/roles/coordinator/commands/monitor.md +0 -133
- package/.codex/skills/team-tech-debt/roles/coordinator/role.md +0 -173
- package/.codex/skills/team-testing/roles/coordinator/commands/analyze.md +0 -70
- package/.codex/skills/team-testing/roles/coordinator/commands/dispatch.md +0 -106
- package/.codex/skills/team-testing/roles/coordinator/commands/monitor.md +0 -156
- package/.codex/skills/team-testing/roles/coordinator/role.md +0 -185
|
@@ -2,83 +2,86 @@
|
|
|
2
2
|
name: maestro-collab
|
|
3
3
|
description: Use when a question needs cross-verification from multiple CLI tools or diverse analytical perspectives
|
|
4
4
|
argument-hint: "\"<requirement>\" [--tools gemini,qwen,claude] [--mode analysis|write] [--rule <template>] [-y]"
|
|
5
|
-
allowed-tools:
|
|
5
|
+
allowed-tools: Read, Write, Edit, Glob, Grep, request_user_input
|
|
6
6
|
---
|
|
7
7
|
|
|
8
8
|
<purpose>
|
|
9
|
-
|
|
10
|
-
|
|
9
|
+
Direct CLI fan-out collaboration via `exec_command`. Diamond topology:
|
|
10
|
+
Fan-out (parallel `exec_command` → `maestro delegate --to <tool>`) → Cross-verify (coordinator) → Synthesize (coordinator).
|
|
11
11
|
|
|
12
|
-
Each CLI tool independently analyzes the requirement
|
|
12
|
+
Each CLI tool independently analyzes the requirement via `maestro delegate` shell call.
|
|
13
|
+
Coordinator polls ALL CLI results to completion via delegate-protocol.codex.md,
|
|
14
|
+
then cross-verifies for consensus/conflicts and synthesizes into unified report.
|
|
15
|
+
|
|
16
|
+
NO spawn_agents_on_csv. NO spawn_agent. ALL CLI calls directly by coordinator via exec_command.
|
|
13
17
|
</purpose>
|
|
14
18
|
|
|
19
|
+
|
|
15
20
|
<context>
|
|
16
21
|
$ARGUMENTS — requirement text and optional flags.
|
|
17
22
|
|
|
18
23
|
**Flags**:
|
|
19
24
|
- `--tools <list>`: Comma-separated CLI tools (default: first 3 enabled)
|
|
20
25
|
- `--mode analysis|write`: Delegate mode (default: analysis)
|
|
21
|
-
- `--rule <template>`: Shared rule template
|
|
26
|
+
- `--rule <template>`: Shared rule template for all delegates (see Rule Reference below)
|
|
22
27
|
- `-y`: Skip confirmations
|
|
23
|
-
|
|
28
|
+
|
|
29
|
+
**Rule Reference** — common `--rule` values for collab scenarios:
|
|
30
|
+
|
|
31
|
+
| Scenario | Rule | Description |
|
|
32
|
+
|----------|------|-------------|
|
|
33
|
+
| Code quality review | `analysis-review-code-quality` | Multi-dimensional code quality review |
|
|
34
|
+
| Architecture review | `analysis-review-architecture` | Architecture design review |
|
|
35
|
+
| Bug root cause | `analysis-diagnose-bug-root-cause` | Bug root-cause diagnosis |
|
|
36
|
+
| Security assessment | `analysis-assess-security-risks` | Security risk assessment |
|
|
37
|
+
| Performance analysis | `analysis-analyze-performance` | Performance bottleneck analysis |
|
|
38
|
+
| Code pattern analysis | `analysis-analyze-code-patterns` | Code pattern and anti-pattern identification |
|
|
39
|
+
| Architecture design | `planning-plan-architecture-design` | Architecture solution design |
|
|
40
|
+
| Task breakdown | `planning-breakdown-task-steps` | Task breakdown planning |
|
|
41
|
+
| Migration strategy | `planning-plan-migration-strategy` | Migration strategy planning |
|
|
42
|
+
| Rigorous style | `universal-universal-rigorous-style` | Rigorous style (general-purpose) |
|
|
24
43
|
|
|
25
44
|
**Auto-select** (no --tools): read `~/.maestro/cli-tools.json` → filter enabled + eligible → first 3. Exclude api-endpoint when --mode write. Minimum 2 required.
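The auto-select rule above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the package's implementation: the layout of `cli-tools.json` (a `tools` mapping whose entries carry `enabled` and `type` fields) and the helper names `auto_select_tools` and `load_config` are hypothetical.

```python
import json
from pathlib import Path
from typing import List


def auto_select_tools(config: dict, mode: str = "analysis", limit: int = 3) -> List[str]:
    """Pick the first `limit` eligible tools in config order (assumed schema)."""
    selected = []
    for name, entry in config.get("tools", {}).items():
        if not entry.get("enabled", False):
            continue
        # Assumed field: entries of type "api-endpoint" are excluded in write mode.
        if mode == "write" and entry.get("type") == "api-endpoint":
            continue
        selected.append(name)
    selected = selected[:limit]
    if len(selected) < 2:
        # Mirrors the E002 condition: fewer than 2 eligible tools.
        raise RuntimeError("E002: fewer than 2 eligible tools")
    return selected


def load_config(path: Path = Path.home() / ".maestro" / "cli-tools.json") -> dict:
    """Read the tool configuration from the path named in the skill."""
    return json.loads(path.read_text(encoding="utf-8"))
```

Because Python dictionaries preserve insertion order, "first 3 in config order" falls out of iterating the mapping directly.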
|
|
26
45
|
|
|
27
|
-
**Session**: `.workflow/.
|
|
46
|
+
**Session**: `.workflow/.maestro/{YYYYMMDD}-collab-{slug}/`
|
|
28
47
|
**Scratch**: `.workflow/scratch/{YYYYMMDD}-collab-{slug}/`
|
|
29
48
|
|
|
30
49
|
**Output files**:
|
|
31
50
|
- `collab-report.md` — merged findings (Consensus/Conflicts/Unique/Recommendations)
|
|
32
51
|
- `context.md` — Locked/Free/Deferred decisions (plan compatible)
|
|
33
52
|
- `conclusions.json` — session_id, tools[], consensus_level, recommendation, confidence, dimensions[], decisions[]
|
|
34
|
-
- `per-tool/{tool}-output.md` — raw outputs
|
|
35
|
-
</context>
|
|
36
|
-
|
|
37
|
-
<csv_schema>
|
|
38
|
-
|
|
39
|
-
### tasks.csv
|
|
40
|
-
|
|
41
|
-
```csv
|
|
42
|
-
id,title,description,tool,role,prompt,mode,rule,deps,context_from,wave,status,findings,recommendations,confidence,error
|
|
43
|
-
"1","CLI: gemini","...","gemini","analyze","<prompt>","analysis","","","","1","","","","",""
|
|
44
|
-
"2","CLI: claude","...","claude","analyze","<prompt>","analysis","","","","1","","","","",""
|
|
45
|
-
"3","Cross-Verify","Compare CLI outputs: CONSENSUS/CONFLICT/UNIQUE","","","","","","1;2","1;2","2","","","","",""
|
|
46
|
-
"4","Synthesis","Merge verified findings → collab-report.md + context.md + conclusions.json","","","","","","3","3","3","","","","",""
|
|
47
|
-
```
|
|
53
|
+
- `per-tool/{tool}-output.md` — raw CLI outputs
|
|
48
54
|
|
|
49
|
-
|
|
50
|
-
Output columns: status (pending→completed/failed), findings, recommendations, confidence, error.
|
|
51
|
-
|
|
52
|
-
### Downstream Compatibility
|
|
55
|
+
**Downstream compatibility**:
|
|
53
56
|
|
|
54
57
|
| Consumer | Artifact |
|
|
55
58
|
|----------|----------|
|
|
56
59
|
| maestro-plan | context.md + conclusions.json (via --dir) |
|
|
57
60
|
| maestro-analyze | context.md as prior context (via state.json) |
|
|
58
61
|
| maestro-ralph | artifact chain lookup (type=collab) |
|
|
59
|
-
|
|
60
|
-
</csv_schema>
|
|
62
|
+
</context>
|
|
61
63
|
|
|
62
64
|
<invariants>
|
|
63
|
-
1. **
|
|
64
|
-
2. **
|
|
65
|
-
3. **
|
|
66
|
-
4. **
|
|
67
|
-
5. **
|
|
68
|
-
6. **
|
|
69
|
-
|
|
65
|
+
1. **ALL analysis via exec_command → maestro delegate** — coordinator NEVER performs analysis internally, NEVER spawns agents for analysis
|
|
66
|
+
2. **exec_command is the execution mechanism** — every delegate call: `exec_command({ cmd: "maestro delegate ..." })`
|
|
67
|
+
3. **delegate-protocol.codex.md governs lifecycle** — MUST follow exec_command → poll write_stdin → parse for every delegate
|
|
68
|
+
4. **NEVER fire-and-forget** — every exec_command MUST be polled to completion via write_stdin, result consumed before proceeding
|
|
69
|
+
5. **NEVER substitute internal reasoning** — if CLI fails, report failure; do NOT generate analysis yourself as replacement
|
|
70
|
+
6. **Indefinite wait** — polling has NO max timeout; continue polling until CLI returns regardless of elapsed time; NEVER abandon a running session
|
|
71
|
+
7. **Same prompt, different --to** — fan-out delegates all use identical base prompt, only `--to <tool>` differs
|
|
72
|
+
8. **Minimum 2 tools** — abort if fewer eligible
|
|
73
|
+
9. **Partial degradation** — 1 tool fails → continue with remaining (minimum 2 results for cross-verify)
|
|
70
74
|
</invariants>
|
|
71
75
|
|
|
72
76
|
<state_machine>
|
|
73
77
|
|
|
74
78
|
<states>
|
|
75
|
-
S_PARSE
|
|
76
|
-
S_CONFIRM
|
|
77
|
-
|
|
78
|
-
|
|
79
|
-
|
|
80
|
-
|
|
81
|
-
S_AGGREGATE — register artifact, output summary PERSIST: state.json + results.csv
|
|
79
|
+
S_PARSE — parse arguments, discover tools PERSIST: —
|
|
80
|
+
S_CONFIRM — show plan, user confirms (skipped with -y) PERSIST: —
|
|
81
|
+
S_FAN_OUT — parallel exec_command fan-out + polling wait PERSIST: per-tool outputs
|
|
82
|
+
S_CROSS_VERIFY — cross-verification: consensus/conflict/unique classification PERSIST: cross-verify.md
|
|
83
|
+
S_SYNTHESIZE — generate final report PERSIST: reports
|
|
84
|
+
S_AGGREGATE — register artifact, output summary PERSIST: state.json
|
|
82
85
|
</states>
|
|
83
86
|
|
|
84
87
|
<transitions>
|
|
@@ -88,25 +91,22 @@ S_PARSE:
|
|
|
88
91
|
→ ERROR(E002) WHEN: eligible tools < 2
|
|
89
92
|
|
|
90
93
|
S_CONFIRM:
|
|
91
|
-
→
|
|
92
|
-
→ S_PARSE WHEN: user modifies tools
|
|
94
|
+
→ S_FAN_OUT WHEN: -y OR user confirms
|
|
95
|
+
→ S_PARSE WHEN: user modifies tools
|
|
93
96
|
→ END WHEN: user cancels
|
|
94
97
|
|
|
95
|
-
|
|
96
|
-
→
|
|
97
|
-
|
|
98
|
-
S_WAVE_1:
|
|
99
|
-
→ S_WAVE_2 WHEN: 1+ agents completed DO: A_SPAWN_WAVE_1
|
|
100
|
-
→ ERROR(E004) WHEN: all failed
|
|
98
|
+
S_FAN_OUT:
|
|
99
|
+
→ S_CROSS_VERIFY WHEN: 2+ delegates completed DO: A_FAN_OUT_DELEGATES
|
|
100
|
+
→ ERROR(E004) WHEN: all failed OR fewer than 2 completed
|
|
101
101
|
|
|
102
|
-
|
|
103
|
-
→
|
|
102
|
+
S_CROSS_VERIFY:
|
|
103
|
+
→ S_SYNTHESIZE DO: A_CROSS_VERIFY
|
|
104
104
|
|
|
105
|
-
|
|
106
|
-
→ S_AGGREGATE
|
|
105
|
+
S_SYNTHESIZE:
|
|
106
|
+
→ S_AGGREGATE DO: A_SYNTHESIZE
|
|
107
107
|
|
|
108
108
|
S_AGGREGATE:
|
|
109
|
-
→ END
|
|
109
|
+
→ END DO: A_AGGREGATE_RESULTS
|
|
110
110
|
|
|
111
111
|
</transitions>
|
|
112
112
|
|
|
@@ -114,111 +114,212 @@ S_AGGREGATE:
|
|
|
114
114
|
|
|
115
115
|
### A_PARSE_AND_DISCOVER
|
|
116
116
|
|
|
117
|
-
1. Parse flags: requirement, tools, mode, rule, autoYes
|
|
118
|
-
2. Read cli-tools.json → build eligible tool list
|
|
117
|
+
1. Parse flags: requirement, tools, mode, rule, autoYes
|
|
118
|
+
2. Read `~/.maestro/cli-tools.json` → build eligible tool list
|
|
119
119
|
3. Auto-select if no --tools: first 3 eligible in config order
|
|
120
|
-
4.
|
|
121
|
-
|
|
120
|
+
4. Build shared delegate prompt (6-field format):
|
|
121
|
+
```
|
|
122
|
+
PURPOSE: {requirement} + cross-verification analysis
|
|
123
|
+
TASK: {specific analysis tasks from requirement}
|
|
124
|
+
MODE: {mode}
|
|
125
|
+
CONTEXT: @**/* | {project context if available}
|
|
126
|
+
EXPECTED: Structured findings with evidence, confidence per dimension
|
|
127
|
+
CONSTRAINTS: {scope limits}
|
|
128
|
+
```
|
|
129
|
+
5. Create session + scratch dirs
|
|
130
|
+
6. `update_plan` with all phases pending
|
|
122
131
|
|
|
123
|
-
###
|
|
132
|
+
### A_FAN_OUT_DELEGATES
|
|
124
133
|
|
|
125
|
-
|
|
126
|
-
- Wave 1: one row per selected tool (parallel)
|
|
127
|
-
- Wave 2: cross-verify row (deps on all wave 1 IDs)
|
|
128
|
-
- Wave 3: synthesis row (deps on wave 2 ID)
|
|
134
|
+
#### Phase 1: Parallel Launch
|
|
129
135
|
|
|
130
|
-
|
|
131
|
-
|
|
132
|
-
Filter wave==1 from CSV → write wave-1.csv.
|
|
136
|
+
Launch ALL delegate commands simultaneously via `multi_tool_use.parallel`:
|
|
133
137
|
|
|
134
138
|
```
|
|
135
|
-
|
|
139
|
+
multi_tool_use.parallel({
|
|
140
|
+
tool_uses: [
|
|
141
|
+
{
|
|
142
|
+
recipient_name: "functions.exec_command",
|
|
143
|
+
parameters: {
|
|
144
|
+
cmd: "maestro delegate \"<shared_prompt>\" --to gemini --mode <mode> [--rule <rule>]",
|
|
145
|
+
yield_time_ms: 30000,
|
|
146
|
+
max_output_tokens: 6000
|
|
147
|
+
}
|
|
148
|
+
},
|
|
149
|
+
{
|
|
150
|
+
recipient_name: "functions.exec_command",
|
|
151
|
+
parameters: {
|
|
152
|
+
cmd: "maestro delegate \"<shared_prompt>\" --to claude --mode <mode> [--rule <rule>]",
|
|
153
|
+
yield_time_ms: 30000,
|
|
154
|
+
max_output_tokens: 6000
|
|
155
|
+
}
|
|
156
|
+
}
|
|
157
|
+
// ... one entry per selected tool
|
|
158
|
+
]
|
|
159
|
+
})
|
|
136
160
|
```
|
|
137
161
|
|
|
138
|
-
|
|
162
|
+
#### Phase 2: Block Until ALL Complete
|
|
139
163
|
|
|
140
|
-
|
|
164
|
+
For each result from Phase 1, check completion status:
|
|
141
165
|
|
|
142
|
-
|
|
166
|
+
- **Completed** (no session_id) → save output directly to `{scratchDir}/per-tool/{tool}-output.md`
|
|
167
|
+
- **Running** (session_id returned) → add to `pending_sessions[]`
|
|
143
168
|
|
|
144
|
-
|
|
169
|
+
**Blocking poll loop — runs until pending_sessions is empty:**
|
|
145
170
|
|
|
146
171
|
```
|
|
147
|
-
|
|
172
|
+
pending_sessions = [{ tool, session_id }, ...]
|
|
173
|
+
|
|
174
|
+
WHILE pending_sessions.length > 0:
|
|
175
|
+
FOR EACH session IN pending_sessions:
|
|
176
|
+
result = write_stdin({
|
|
177
|
+
session_id: session.session_id,
|
|
178
|
+
chars: "",
|
|
179
|
+
yield_time_ms: 60000, // 60s per poll — no rush, wait for real output
|
|
180
|
+
max_output_tokens: 6000
|
|
181
|
+
})
|
|
182
|
+
|
|
183
|
+
IF result indicates completed:
|
|
184
|
+
save output → {scratchDir}/per-tool/{session.tool}-output.md
|
|
185
|
+
REMOVE session FROM pending_sessions
|
|
186
|
+
completed_count += 1
|
|
187
|
+
|
|
188
|
+
IF result indicates failed/error:
|
|
189
|
+
log error for session.tool
|
|
190
|
+
REMOVE session FROM pending_sessions
|
|
191
|
+
failed_count += 1
|
|
192
|
+
|
|
193
|
+
// still running → stays in pending_sessions, poll again next round
|
|
148
194
|
```
|
|
149
195
|
|
|
150
|
-
**
|
|
196
|
+
**Blocking guarantees:**
|
|
197
|
+
- `yield_time_ms: 60000` — each poll waits up to 60s for output, no short-circuit
|
|
198
|
+
- NO max retry count — loop continues indefinitely until CLI returns
|
|
199
|
+
- NO timeout escalation — delegate can run as long as needed (30s to 10min+)
|
|
200
|
+
- NO early exit — even if tool 1 and 2 are done, keep polling tool 3 until it completes
|
|
201
|
+
- Round-robin ensures fair polling across all pending sessions
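The blocking poll loop and its guarantees can be modelled in ordinary code as below. This is a sketch under assumptions: the `poll` callable stands in for the `write_stdin` tool call, and its return shape (a `(state, output)` pair with states `"running"`, `"completed"`, `"failed"`) is invented for the example, since the skill only says the result "indicates" completion or failure.

```python
from typing import Callable, Dict, List, Tuple

# Assumed result shape: (state, output), state in {"running", "completed", "failed"}.
PollFn = Callable[[str], Tuple[str, str]]


def poll_until_done(pending: List[Dict[str, str]], poll: PollFn):
    """Round-robin over pending sessions until none remain.

    There is deliberately no retry cap and no overall timeout: the loop
    only ends when every session has reported completed or failed.
    """
    outputs: Dict[str, str] = {}
    failures: Dict[str, str] = {}
    while pending:
        still_running = []
        for session in pending:
            state, output = poll(session["session_id"])
            if state == "completed":
                outputs[session["tool"]] = output  # kept verbatim, never summarized
            elif state == "failed":
                failures[session["tool"]] = output
            else:
                still_running.append(session)  # polled again next round
        pending = still_running
    return outputs, failures
```

Rebuilding the pending list on each round avoids mutating a list while iterating over it, and every still-running session gets exactly one poll per round.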
|
|
202
|
+
|
|
203
|
+
#### Phase 3: Validate
|
|
204
|
+
|
|
205
|
+
- Count completed tools
|
|
206
|
+
- completed < 2 → ERROR(E004)
|
|
207
|
+
- 1 tool failed but 2+ succeeded → W001, log failure, continue
|
|
208
|
+
|
|
209
|
+
**Iron rules**:
|
|
210
|
+
- NEVER skip polling — every session_id MUST be polled to completion
|
|
211
|
+
- NEVER proceed to S_CROSS_VERIFY while pending_sessions is non-empty
|
|
212
|
+
- NEVER set a max timeout or max retry count on the poll loop
|
|
213
|
+
- NEVER generate analysis internally as substitute for CLI output
|
|
214
|
+
- NEVER summarize or paraphrase — save raw CLI output verbatim
|
|
215
|
+
|
|
216
|
+
### A_CROSS_VERIFY
|
|
217
|
+
|
|
218
|
+
Coordinator reads ALL per-tool outputs from `{scratchDir}/per-tool/` and classifies each finding:
|
|
151
219
|
|
|
152
220
|
| Condition | Tag |
|
|
153
221
|
|-----------|-----|
|
|
154
|
-
| 2+ tools agree | CONSENSUS |
|
|
155
|
-
| Tools
|
|
156
|
-
| 1 tool
|
|
222
|
+
| 2+ tools agree on same finding | CONSENSUS |
|
|
223
|
+
| Tools have contradictory findings | CONFLICT |
|
|
224
|
+
| Only 1 tool identified | UNIQUE |
|
|
157
225
|
|
|
158
|
-
|
|
226
|
+
For each CONFLICT: note which tools disagree, their evidence, and confidence levels.
|
|
159
227
|
|
|
160
|
-
|
|
228
|
+
Compute: `consensus_level = consensus_count / total_findings * 100`
|
|
161
229
|
|
|
162
|
-
|
|
230
|
+
Write results to `{scratchDir}/cross-verify.md`.
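The classification table and the `consensus_level` formula above can be illustrated with a small sketch. It assumes findings have already been normalized into `(tool, finding_key, stance)` triples; that triple format and the function name `cross_verify` are hypothetical, because the skill does not specify how the coordinator matches findings across tools.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def cross_verify(reports: List[Tuple[str, str, str]]):
    """Tag each finding and compute consensus_level as a percentage.

    reports: (tool, finding_key, stance) triples (assumed input format).
    """
    by_finding: Dict[str, Dict[str, str]] = defaultdict(dict)
    for tool, key, stance in reports:
        by_finding[key][tool] = stance

    tags: Dict[str, str] = {}
    for key, stances in by_finding.items():
        if len(set(stances.values())) > 1:
            tags[key] = "CONFLICT"   # tools report contradictory stances
        elif len(stances) >= 2:
            tags[key] = "CONSENSUS"  # 2+ tools agree on the same finding
        else:
            tags[key] = "UNIQUE"     # only one tool identified it

    total = len(tags)
    consensus = sum(1 for tag in tags.values() if tag == "CONSENSUS")
    level = consensus / total * 100 if total else 0.0
    return tags, level
```

With one consensus finding out of four, the level is 25%, which is below the 40% threshold that the error table associates with W004.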
|
|
163
231
|
|
|
164
|
-
|
|
232
|
+
### A_SYNTHESIZE
|
|
165
233
|
|
|
166
|
-
|
|
167
|
-
spawn_agents_on_csv({ csv_path: "wave-3.csv", max_concurrency: 1 })
|
|
168
|
-
```
|
|
234
|
+
Generate 3 output files from cross-verify results:
|
|
169
235
|
|
|
170
|
-
**
|
|
171
|
-
|
|
172
|
-
|
|
173
|
-
3. **conclusions.json**: session_id, subject, mode, tools[], consensus_level, recommendation (Go/No-Go/Conditional), confidence, dimensions[], decisions[]
|
|
236
|
+
1. **collab-report.md**:
|
|
237
|
+
```markdown
|
|
238
|
+
# Collaborative Analysis: {requirement}
|
|
174
239
|
|
|
175
|
-
|
|
240
|
+
## Summary
|
|
241
|
+
Tools: {tool list} | Consensus: {consensus_level}%
|
|
176
242
|
|
|
177
|
-
|
|
243
|
+
## Consensus Findings
|
|
244
|
+
{findings agreed by 2+ tools, with evidence}
|
|
178
245
|
|
|
179
|
-
|
|
180
|
-
|
|
181
|
-
3. Copy collab-report.md + context.md + conclusions.json → scratchDir
|
|
182
|
-
4. Register CLB artifact in state.json (type: collab, scope: adhoc)
|
|
183
|
-
5. Spec enrichment: for each Locked decision → `maestro spec add arch`
|
|
184
|
-
6. Display summary (requirement, tools, consensus_level, per-tool status, artifact ID, next steps)
|
|
246
|
+
## Conflicts
|
|
247
|
+
{contradictory findings with per-tool positions and evidence}
|
|
185
248
|
|
|
186
|
-
|
|
249
|
+
## Unique Insights
|
|
250
|
+
{single-tool findings worth noting}
|
|
187
251
|
|
|
188
|
-
|
|
252
|
+
## Recommendations
|
|
253
|
+
{actionable recommendations, prioritized}
|
|
189
254
|
|
|
190
|
-
|
|
255
|
+
## Per-Tool Confidence
|
|
256
|
+
| Tool | Confidence | Key Contribution |
|
|
257
|
+
|------|-----------|-----------------|
|
|
258
|
+
```
|
|
191
259
|
|
|
192
|
-
|
|
193
|
-
|------|-----------|------|
|
|
194
|
-
| cli_finding | tool+dimension | {tool, dimension, finding, confidence, evidence} |
|
|
195
|
-
| consensus | area | {area, tools[], finding, confidence} |
|
|
196
|
-
| conflict | area | {area, positions[{tool, stance, evidence}], resolution} |
|
|
197
|
-
| unique_insight | tool+finding | {tool, finding, significance, actionable} |
|
|
260
|
+
2. **context.md**: Locked (CONSENSUS) / Free (UNIQUE w/ strong evidence) / Deferred (CONFLICT unresolved)
|
|
198
261
|
|
|
199
|
-
|
|
200
|
-
|
|
262
|
+
3. **conclusions.json**:
|
|
263
|
+
```json
|
|
264
|
+
{
|
|
265
|
+
"session_id": "", "subject": "", "mode": "",
|
|
266
|
+
"tools": [], "consensus_level": 0,
|
|
267
|
+
"recommendation": "Go|No-Go|Conditional",
|
|
268
|
+
"confidence": 0,
|
|
269
|
+
"dimensions": [{ "name": "", "consensus": "", "details": "" }],
|
|
270
|
+
"decisions": [{ "area": "", "status": "locked|free|deferred", "rationale": "" }]
|
|
271
|
+
}
|
|
272
|
+
```
|
|
273
|
+
|
|
274
|
+
### A_AGGREGATE_RESULTS
|
|
275
|
+
|
|
276
|
+
1. Copy outputs to scratchDir
|
|
277
|
+
2. Register CLB artifact in state.json (type: collab, scope: adhoc)
|
|
278
|
+
3. Spec enrichment: for each Locked decision → `maestro spec add arch`
|
|
279
|
+
4. `update_plan` all steps completed
|
|
280
|
+
5. Display summary:
|
|
281
|
+
```
|
|
282
|
+
== Collab Analysis Complete ==
|
|
283
|
+
Requirement: {requirement}
|
|
284
|
+
Tools: {tool list with status}
|
|
285
|
+
Consensus Level: {consensus_level}%
|
|
286
|
+
|
|
287
|
+
Key Findings:
|
|
288
|
+
CONSENSUS: {count}
|
|
289
|
+
CONFLICT: {count}
|
|
290
|
+
UNIQUE: {count}
|
|
291
|
+
|
|
292
|
+
Reports: {scratchDir}/collab-report.md
|
|
293
|
+
Next: $maestro-plan --dir {scratchDir}
|
|
294
|
+
```
|
|
295
|
+
|
|
296
|
+
</actions>
|
|
297
|
+
|
|
298
|
+
</state_machine>
|
|
201
299
|
|
|
202
300
|
<error_codes>
|
|
203
301
|
| Code | Condition | Recovery |
|
|
204
302
|
|------|-----------|----------|
|
|
205
|
-
| E002 | Fewer than 2 eligible tools | Check cli-tools.json |
|
|
206
|
-
| E004 | All
|
|
207
|
-
| W001 | One tool failed
|
|
208
|
-
|
|
|
209
|
-
| W004 | consensus_level < 40% | Flag in summary |
|
|
303
|
+
| E002 | Fewer than 2 eligible tools | Check cli-tools.json, specify --tools |
|
|
304
|
+
| E004 | All delegates failed or < 2 completed | Show per-tool errors, abort |
|
|
305
|
+
| W001 | One tool failed | Continue with remaining |
|
|
306
|
+
| W004 | consensus_level < 40% | Flag in summary as low-confidence |
|
|
210
307
|
</error_codes>
|
|
211
308
|
|
|
212
309
|
<success_criteria>
|
|
213
|
-
- [ ]
|
|
214
|
-
- [ ]
|
|
215
|
-
- [ ]
|
|
216
|
-
- [ ]
|
|
217
|
-
- [ ]
|
|
310
|
+
- [ ] ALL analysis performed via exec_command → maestro delegate — zero internal analysis
|
|
311
|
+
- [ ] multi_tool_use.parallel used for fan-out launch
|
|
312
|
+
- [ ] Every exec_command polled to completion via write_stdin — no timeout cap, no max retries
|
|
313
|
+
- [ ] Blocking poll loop ran until pending_sessions empty — no early exit
|
|
314
|
+
- [ ] Per-tool raw outputs saved to {scratchDir}/per-tool/
|
|
315
|
+
- [ ] Cross-verify: CONSENSUS/CONFLICT/UNIQUE classified, consensus_level computed
|
|
316
|
+
- [ ] collab-report.md + context.md + conclusions.json produced
|
|
317
|
+
- [ ] CLB artifact registered in state.json
|
|
318
|
+
- [ ] Partial degradation: continued if 2+ tools succeeded
|
|
218
319
|
</success_criteria>
|
|
219
320
|
|
|
220
321
|
<next_step_routing>
|
|
221
322
|
- Deep feasibility analysis → `$maestro-analyze "{topic}"`
|
|
222
|
-
- Plan from conclusions → `$maestro-plan --dir {
|
|
323
|
+
- Plan from conclusions → `$maestro-plan --dir {scratchDir}`
|
|
223
324
|
- Expand exploration → `$maestro-brainstorm "{topic}"`
|
|
224
325
|
</next_step_routing>
|
|
@@ -94,12 +94,12 @@ When `--yes` or `-y`: Auto-confirm task breakdown, skip blocked-task prompts, au
|
|
|
94
94
|
### tasks.csv (Master State)
|
|
95
95
|
|
|
96
96
|
```csv
|
|
97
|
-
id,title,description,scope,convergence_criteria,hints,execution_directives,deps,context_from,wave
|
|
98
|
-
"TASK-001","Setup auth module","Create authentication module with JWT token generation and verification. Export verifyToken and generateToken functions.","src/auth/","auth.ts contains export function verifyToken(; auth.ts contains export function generateToken(","Reference existing middleware pattern in src/middleware/auth.ts","npm test -- --grep auth","","","1"
|
|
99
|
-
"TASK-002","Create user model","Define User interface and database schema with email, passwordHash, role fields. Use existing Result type pattern.","src/models/","user.ts contains export interface User; user.ts contains email: string","See src/models/session.ts for existing model pattern","npm test -- --grep user","","","1"
|
|
100
|
-
"TASK-003","Auth middleware","Create Express middleware that validates JWT from Authorization header. Use verifyToken from auth module. Return 401 on invalid token.","src/middleware/","auth-middleware.ts contains export function authMiddleware(; auth-middleware.ts contains verifyToken","Follows existing middleware pattern in src/middleware/logging.ts","npm test -- --grep middleware","TASK-001","TASK-001","2"
|
|
101
|
-
"TASK-004","Login endpoint","Implement POST /api/login endpoint. Validate credentials against user model, return JWT on success. Use generateToken from auth module.","src/routes/","login.ts contains router.post('/api/login'; login.ts contains generateToken(","Wire into existing Express app in src/app.ts","curl -X POST localhost:3000/api/login","TASK-001;TASK-002","TASK-001;TASK-002","2"
|
|
102
|
-
"TASK-005","Integration tests","Write integration tests for full auth flow: register, login, access protected route, token refresh.","tests/","tests/auth.test.ts exists; npm test exits with code 0","Use existing test setup in tests/setup.ts","npm test","TASK-003;TASK-004","TASK-003;TASK-004","3"
|
|
97
|
+
id,title,description,scope,convergence_criteria,hints,execution_directives,deps,context_from,wave
|
|
98
|
+
"TASK-001","Setup auth module","Create authentication module with JWT token generation and verification. Export verifyToken and generateToken functions.","src/auth/","auth.ts contains export function verifyToken(; auth.ts contains export function generateToken(","Reference existing middleware pattern in src/middleware/auth.ts","npm test -- --grep auth","","","1"
|
|
99
|
+
"TASK-002","Create user model","Define User interface and database schema with email, passwordHash, role fields. Use existing Result type pattern.","src/models/","user.ts contains export interface User; user.ts contains email: string","See src/models/session.ts for existing model pattern","npm test -- --grep user","","","1"
|
|
100
|
+
"TASK-003","Auth middleware","Create Express middleware that validates JWT from Authorization header. Use verifyToken from auth module. Return 401 on invalid token.","src/middleware/","auth-middleware.ts contains export function authMiddleware(; auth-middleware.ts contains verifyToken","Follows existing middleware pattern in src/middleware/logging.ts","npm test -- --grep middleware","TASK-001","TASK-001","2"
|
|
101
|
+
"TASK-004","Login endpoint","Implement POST /api/login endpoint. Validate credentials against user model, return JWT on success. Use generateToken from auth module.","src/routes/","login.ts contains router.post('/api/login'; login.ts contains generateToken(","Wire into existing Express app in src/app.ts","curl -X POST localhost:3000/api/login","TASK-001;TASK-002","TASK-001;TASK-002","2"
|
|
102
|
+
"TASK-005","Integration tests","Write integration tests for full auth flow: register, login, access protected route, token refresh.","tests/","tests/auth.test.ts exists; npm test exits with code 0","Use existing test setup in tests/setup.ts","npm test","TASK-003;TASK-004","TASK-003;TASK-004","3"
|
|
103
103
|
```
|
|
104
104
|
|
|
105
105
|
**Columns**:
|
|
@@ -116,12 +116,14 @@ id,title,description,scope,convergence_criteria,hints,execution_directives,deps,
|
|
|
116
116
|
| `deps` | Input | Semicolon-separated dependency task IDs |
|
|
117
117
|
| `context_from` | Input | Semicolon-separated task IDs whose findings this task needs |
|
|
118
118
|
| `wave` | Computed | Wave number from plan.json wave assignment |
|
|
119
|
-
| `status` | Output | `pending` -> `completed` / `failed` / `blocked` / `skipped` |
|
|
119
|
+
| `status` | Output | `pending` -> `completed` / `failed` / `blocked` / `skipped` (mapped from output_schema `result_status`) |
|
|
120
120
|
| `findings` | Output | Implementation notes and observations (max 500 chars) |
|
|
121
121
|
| `files_modified` | Output | Semicolon-separated list of created/modified files |
|
|
122
122
|
| `tests_passed` | Output | Test pass/fail status from execution_directives |
|
|
123
123
|
| `error` | Output | Error message if failed or blocked |
|
|
124
124
|
|
|
125
|
+
**Column separation rule**: Wave CSV (input to spawn_agents_on_csv) and output_schema MUST NOT share column names. Wave CSV only contains Input columns + prev_context. Output columns are returned exclusively via output_schema (using `result_status`, not `status`). During merge, `result_status` maps back to the master CSV's `status` column.
|
|
126
|
+
|
|
125
127
|
### Per-Wave CSV (Temporary)
|
|
126
128
|
|
|
127
129
|
Each wave generates `wave-{N}.csv` with extra `prev_context` column populated from predecessor task findings.
|
|
@@ -132,7 +134,7 @@ Each wave generates `wave-{N}.csv` with extra `prev_context` column populated fr
|
|
|
132
134
|
|------|---------|-----------|
|
|
133
135
|
| `tasks.csv` | Master state -- all tasks with status/findings | Updated after each wave |
|
|
134
136
|
| `wave-{N}.csv` | Per-wave input (temporary) | Created before wave, deleted after |
|
|
135
|
-
| `wave-{N}-results.csv` | Per-wave output | Created by spawn_agents_on_csv |
|
|
137
|
+
| `wave-{N}-results.csv` | Per-wave output (uses `result_status`) | Created by spawn_agents_on_csv, deleted after merge |
|
|
136
138
|
| `results.csv` | Final export of all task results | Created in Phase 3 |
|
|
137
139
|
| `discoveries.ndjson` | Shared exploration board | Append-only, carries across waves |
|
|
138
140
|
| `context.md` | Human-readable execution report | Created in Phase 3 |
|
|
@@ -158,7 +160,7 @@ Each wave generates `wave-{N}.csv` with extra `prev_context` column populated fr
|
|
|
158
160
|
4. **Context Propagation**: prev_context built from master CSV findings, not from memory
|
|
159
161
|
5. **Discovery Board is Append-Only**: Never clear, modify, or recreate discoveries.ndjson
|
|
160
162
|
6. **Cascading Skip on Failure**: If a task fails/blocks, all dependent tasks are skipped
|
|
161
|
-
7. **Cleanup Temp Files**: Remove wave-{N}.csv after results are merged
|
|
163
|
+
7. **Cleanup Temp Files**: Remove `wave-{N}.csv` AND `wave-{N}-results.csv` after results are merged
|
|
162
164
|
8. **Max 3 Fix Attempts**: Per task, auto-fix convergence failures up to 3 times, then mark blocked
|
|
163
165
|
9. **Breakpoint Resume**: Always detect completed tasks and skip them on re-run
|
|
164
166
|
10. **DO NOT STOP**: Continuous execution until all waves complete or user explicitly stops
|
|
@@ -251,11 +253,11 @@ spawn_agents_on_csv({
|
|
|
251
253
|
instruction: buildExecutorInstruction(sessionFolder, phaseDir, autoCommit, specsContent), // agent: ~/.codex/agents/workflow-executor.toml
|
|
252
254
|
max_concurrency: maxConcurrency, max_runtime_seconds: 3600,
|
|
253
255
|
output_csv_path: `${sessionFolder}/wave-${N}-results.csv`,
|
|
254
|
-
output_schema: { id,
|
|
256
|
+
output_schema: { id, result_status: [completed|failed|blocked], findings, files_modified, tests_passed, error }
|
|
255
257
|
})
|
|
256
258
|
```
|
|
257
259
|
|
|
258
|
-
4. Merge results into master `tasks.csv
|
|
260
|
+
4. Merge results into master `tasks.csv`: map `result_status` from `wave-{N}-results.csv` to the `status` column in master CSV. Delete `wave-{N}.csv` AND `wave-{N}-results.csv` after merge.
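The merge step can be sketched as follows, assuming the master CSV has been loaded as a list of row dictionaries. The function name `merge_wave_results` and the in-memory representation are illustrative choices rather than part of the skill; the column names come from the `output_schema` shown in the diff.

```python
import csv
from pathlib import Path
from typing import Dict, List

# Output columns copied across as-is; result_status is handled separately.
OUTPUT_COLUMNS = ["findings", "files_modified", "tests_passed", "error"]


def merge_wave_results(master_rows: List[Dict[str, str]], results_path: Path) -> None:
    """Fold wave-{N}-results.csv into master rows, mapping result_status to status."""
    with results_path.open(newline="", encoding="utf-8") as fh:
        results = {row["id"]: row for row in csv.DictReader(fh)}

    for row in master_rows:
        result = results.get(row["id"])
        if result is None:
            continue  # task was not part of this wave
        row["status"] = result["result_status"]
        for column in OUTPUT_COLUMNS:
            row[column] = result.get(column, "")

    # The results file is temporary and is removed once merged.
    results_path.unlink()
```

Rows that belong to other waves are left untouched, so the function can be called once per wave against the same master list.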
|
|
259
261
|
|
|
260
262
|
#### Blocked Task Handling
|
|
261
263
|
|
|
@@ -28,9 +28,9 @@ $maestro-milestone-audit "M1"
 ### tasks.csv (Master State)
 
 ```csv
-id,title,description,scope,check_targets,deps,wave
-"integ-1","Interface & dependency chains","Verify shared interfaces are consistent across phases: re-exports match, dependency chains unbroken, no circular imports between phase outputs","cross-phase imports, shared types, re-exports","grep for shared type names across phase output dirs; verify export/import consistency","","1"
-"integ-2","Data contracts & API consistency","Verify request/response schemas match across phases: API signatures consistent, error codes aligned, no contract drift","request/response schemas, API signatures, error codes","diff API type definitions across phases; check error code enum consistency","","1"
+id,title,description,scope,check_targets,deps,wave
+"integ-1","Interface & dependency chains","Verify shared interfaces are consistent across phases: re-exports match, dependency chains unbroken, no circular imports between phase outputs","cross-phase imports, shared types, re-exports","grep for shared type names across phase output dirs; verify export/import consistency","","1"
+"integ-2","Data contracts & API consistency","Verify request/response schemas match across phases: API signatures consistent, error codes aligned, no contract drift","request/response schemas, API signatures, error codes","diff API type definitions across phases; check error code enum consistency","","1"
 ```
 
 **Columns**:
@@ -44,19 +44,21 @@ id,title,description,scope,check_targets,deps,wave,status,findings,gaps_found,se
 | `check_targets` | Input | Specific verification commands/grep patterns |
 | `deps` | Input | Dependencies (empty — all wave 1) |
 | `wave` | Computed | Wave number (always 1 — single parallel wave) |
-| `
+| `result_status` | Output | `pass` / `fail` / `warning` |
 | `findings` | Output | Detailed findings per dimension (max 500 chars) |
 | `gaps_found` | Output | Semicolon-separated list of integration gaps |
 | `severity` | Output | `critical` / `warning` / `info` per gap |
 | `error` | Output | Error message if check failed |
 
+**Column separation rule**: Input columns and Output columns MUST NOT share names. Wave CSV only contains Input columns. Output columns are returned exclusively via output_schema.
+
 ### Session Structure
 
 ```
 .workflow/.csv-wave/{YYYYMMDD}-audit-{milestone}/
 +-- tasks.csv
-+-- wave-1.csv (temporary)
-+-- wave-1-results.csv
++-- wave-1.csv (temporary, deleted after merge)
++-- wave-1-results.csv (temporary, deleted after merge)
 ```
 </csv_schema>
 
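The column separation rule added in the hunk above can be checked mechanically. The sketch below assumes the input column list is the `tasks.csv` header shown earlier in this diff and the output column list is the `output_schema` without its `id` join key. Both helper names are hypothetical.

```javascript
// Column lists taken from the CSV header and output_schema in the diff;
// treating id as a shared join key is an assumption of this sketch.
const INPUT_COLUMNS = ["id", "title", "description", "scope", "check_targets", "deps", "wave"];
const OUTPUT_COLUMNS = ["result_status", "findings", "gaps_found", "severity", "error"];

// Hypothetical guard: fail fast if any name appears on both sides.
function assertColumnSeparation(inputs, outputs) {
  const clash = outputs.filter((name) => inputs.includes(name));
  if (clash.length > 0) {
    throw new Error(`Input/Output columns share names: ${clash.join(", ")}`);
  }
}

// Hypothetical projection: a wave row carries only the input columns.
function toWaveRow(masterRow) {
  return Object.fromEntries(
    INPUT_COLUMNS.map((name) => [name, masterRow[name] ?? ""])
  );
}

assertColumnSeparation(INPUT_COLUMNS, OUTPUT_COLUMNS);
const waveRow = toWaveRow({
  id: "integ-1",
  title: "Interface & dependency chains",
  wave: "1",
  status: "pending",
  findings: "",
});
console.log(Object.keys(waveRow).join(","));
// prints "id,title,description,scope,check_targets,deps,wave"
```

Read this way, naming the worker's field `result_status` rather than `status` keeps it from colliding with the `status` column of the master CSV, which is presumably the motivation for the rename.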
@@ -95,16 +97,16 @@ Verify all adhoc-scoped artifacts completed. For each execute artifact, verify a
 spawn_agents_on_csv({
 csv_path: `${sessionFolder}/wave-1.csv`,
 id_column: "id",
-instruction: `You are an integration checker for milestone ${milestone}. For each row, examine the scope and check_targets. Search the codebase for inconsistencies, contract drift, and broken dependencies across phase outputs. Report findings with file:line references. Set
+instruction: `You are an integration checker for milestone ${milestone}. For each row, examine the scope and check_targets. Search the codebase for inconsistencies, contract drift, and broken dependencies across phase outputs. Report findings with file:line references. Set result_status to pass/fail/warning. List specific gaps in gaps_found (semicolon-separated).`,
 max_concurrency: 2, max_runtime_seconds: 600,
 output_csv_path: `${sessionFolder}/wave-1-results.csv`,
-output_schema: { id,
+output_schema: { id, result_status: [pass|fail|warning], findings, gaps_found, severity, error }
 })
 ```
 
-4. Merge results into master `tasks.csv`
+4. Merge results into master `tasks.csv`: map `result_status` → master `status` column, copy `findings`, `gaps_found`, `severity`, `error`. Delete temporary files (`wave-1.csv`, `wave-1-results.csv`) after merge.
 5. Parse `gaps_found` from all workers — aggregate into `.workflow/milestones/{milestone}/audit-report.md`
-6. Any worker with `result_status == fail` and `severity == critical` → milestone verdict = FAIL
+6. Any worker with `result_status == fail` and `severity == critical` → milestone verdict = FAIL
 
 ### Step 6: Verdict
 
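Steps 5 and 6 of the hunk above (gap aggregation and the FAIL rule) can be sketched as two small functions. The sketch assumes result rows are plain objects and that `severity` holds a single value per row. The function names are hypothetical, and only the FAIL rule is modeled because the other verdict values are not shown in this hunk.

```javascript
// Hypothetical aggregation of the semicolon-separated gaps_found column.
function collectGaps(rows) {
  return rows.flatMap((row) =>
    (row.gaps_found || "")
      .split(";")
      .map((gap) => gap.trim())
      .filter(Boolean)
  );
}

// Returns "FAIL" when the rule from step 6 fires, otherwise null,
// since the remaining verdict values are decided elsewhere.
function failVerdict(rows) {
  const critical = rows.some(
    (row) => row.result_status === "fail" && row.severity === "critical"
  );
  return critical ? "FAIL" : null;
}

const auditRows = [
  { id: "integ-1", result_status: "pass", gaps_found: "", severity: "info" },
  {
    id: "integ-2",
    result_status: "fail",
    gaps_found: "error code enum differs; response schema drift",
    severity: "critical",
  },
];
console.log(collectGaps(auditRows));
// prints [ 'error code enum differs', 'response schema drift' ]
console.log(failVerdict(auditRows));
// prints "FAIL"
```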
@@ -99,8 +99,8 @@ S_BUILD_CHAIN:
 -> S_CREATE_SESSION DO: A_BUILD_STEPS
 
 S_CREATE_SESSION:
--> S_CONFIRM WHEN: not auto_mode
--> S_LOAD_NEXT WHEN: auto_mode
+-> S_CONFIRM WHEN: not auto_mode DO: A_CREATE_SESSION
+-> S_LOAD_NEXT WHEN: auto_mode DO: A_CREATE_SESSION
 
 S_CONFIRM:
 -> S_LOAD_NEXT WHEN: "Proceed"
@@ -220,6 +220,13 @@ Priority: regex from intent `phase\s*(\d+)` -> latest in-progress artifact's pha
 6. Args use placeholders `{phase}`, `{intent}`, `{dirs}` — resolved at wave execution time
 7. Append `-y` to all skill args when `auto_mode` is true (see -y propagation table in context)
 
+### A_CREATE_SESSION
+
+1. Write `.workflow/.maestro/ralph-{YYYYMMDD-HHmmss}/status.json` (see Session JSON Schema)
+2. Initialize tracking:
+- `create_goal({ objective: "Ralph lifecycle: {quality_mode} mode, {N} steps from {lifecycle_position}" })`
+- `update_plan({ plan: steps.map(step => { step, status: "pending" }) })`
+
 ### A_BUILD_AND_SPAWN_WAVE
 
 1. Conditional step eval: check_coverage -> read validation.json, skip if >= threshold
@@ -255,11 +262,16 @@ Update session: milestone, phase, reset passed_gates. Re-infer quality_mode. Bui
 
 ### A_FINALIZE
 
-Set status = "completed"
+1. Set `session.status = "completed"`, write status.json
+2. Sync update_plan: all steps → "completed"
+3. `update_goal({ status: "complete" })` — release goal constraint
+4. Display completion report
 
 ### A_PAUSE_SESSION
 
-Set status = "paused"
+1. Set `session.status = "paused"`, write status.json
+2. Do NOT call `update_goal` — goal stays for `execute`/`continue` resume
+3. Display: use `$maestro-ralph execute` to continue
 
 </actions>
 
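The difference between `A_FINALIZE` and `A_PAUSE_SESSION` in the hunk above is whether the goal is released. The sketch below illustrates that with a stub tracker that only records calls. It does not model the real `update_plan` or `update_goal` tools, writing `status.json` is omitted, and the helper names `makeTracker`, `finalize`, and `pause` are hypothetical.

```javascript
// Stub that records calls instead of invoking any real tool.
function makeTracker() {
  const calls = [];
  return {
    calls,
    update_plan: (arg) => calls.push(["update_plan", arg]),
    update_goal: (arg) => calls.push(["update_goal", arg]),
  };
}

function finalize(session, tracker) {
  session.status = "completed";
  // Parentheses around the object literal make the arrow function
  // return it; a bare { ... } body would return undefined.
  tracker.update_plan({
    plan: session.steps.map((step) => ({ step, status: "completed" })),
  });
  tracker.update_goal({ status: "complete" }); // release the goal
  return session;
}

function pause(session, tracker) {
  session.status = "paused";
  // Deliberately no tracker.update_goal call: the goal stays active
  // so that a later execute/continue can resume the session.
  return session;
}

const finishedTracker = makeTracker();
const finished = finalize({ status: "running", steps: ["plan", "execute"] }, finishedTracker);
const pausedTracker = makeTracker();
const paused = pause({ status: "running", steps: ["plan", "execute"] }, pausedTracker);
console.log(finished.status, finishedTracker.calls.map((c) => c[0]).join(","));
// prints "completed update_plan,update_goal"
console.log(paused.status, pausedTracker.calls.length);
// prints "paused 0"
```

If the `steps.map(step => { step, status: "pending" })` snippet in `A_CREATE_SESSION` is read as literal JavaScript, it would need the same parenthesized form to produce plan entries.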