maestro-flow 0.3.40 → 0.3.41

@@ -0,0 +1,333 @@
1
+ ---
2
+ name: maestro-collab
3
+ description: Multi-CLI collaborative analysis -- fan-out to multiple CLI tools, cross-verify, synthesize
4
+ argument-hint: "\"<requirement>\" [--tools gemini,qwen,claude] [--mode analysis|write] [--rule <template>] [-y]"
5
+ allowed-tools:
6
+ - Read
7
+ - Write
8
+ - Bash
9
+ - Glob
10
+ - Grep
11
+ - Agent
12
+ - AskUserQuestion
13
+ ---
14
+
15
+ <purpose>
16
+ Multi-CLI collaboration: fan-out the same requirement to multiple CLI tools in parallel, cross-verify outputs for consensus/conflicts, then synthesize into a unified report with standard downstream artifacts (context.md + conclusions.json).
17
+
18
+ Each CLI tool independently analyzes the requirement. Results are compared and merged via evidence-weighted synthesis.
19
+ </purpose>
20
+
21
+ <context>
22
+ $ARGUMENTS — requirement text and optional flags.
23
+
24
+ ```bash
25
+ /maestro-collab "analyze the auth module for security vulnerabilities"
26
+ /maestro-collab "design a caching strategy" --tools gemini,qwen,claude
27
+ /maestro-collab -y "review error handling patterns"
28
+ /maestro-collab "refactor user service" --mode write --tools gemini,claude
29
+ ```
30
+
31
+ **Flags**:
32
+ - `--tools <list>`: Comma-separated CLI tools (default: auto-select first 3 enabled)
33
+ - `--mode analysis|write`: Delegate mode (default: analysis)
34
+ - `--rule <template>`: Shared rule template for all delegates
35
+ - `-y` / `--yes`: Skip plan confirmation
36
+
37
+ **Output**: `.workflow/scratch/{YYYYMMDD}-collab-{slug}/`
38
+ - `collab-report.md` — full collaboration report
39
+ - `context.md` — standard Locked/Free/Deferred decisions (plan/analyze compatible)
40
+ - `conclusions.json` — structured conclusions (plan fast-track compatible)
41
+ - `per-tool/{tool}-output.md` — raw per-tool outputs
42
+ </context>
43
+
44
+ <execution>
45
+
46
+ ### Step 1: Parse Arguments
47
+
48
+ Extract from `$ARGUMENTS`:
49
+ - `requirement` — remaining text after flag removal (error if empty)
50
+ - `--tools` → `selectedTools` (comma-split)
51
+ - `--mode` → `delegateMode` (default: `analysis`)
52
+ - `--rule` → `ruleTemplate`
53
+ - `-y` / `--yes` → `autoYes`
54
+
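The extraction rules in Step 1 can be sketched as a small tokenizer. This is a minimal illustration only: the function name `parse_collab_args` and the use of `shlex` for quoting are assumptions, not part of the skill definition, and the error text simply reuses code E001 from the error table.

```python
import shlex


def parse_collab_args(raw: str) -> dict:
    """Split an argument string into requirement text and flags (hypothetical helper)."""
    tokens = shlex.split(raw)
    opts = {
        "requirement": "",
        "selectedTools": None,       # None means "auto-select later"
        "delegateMode": "analysis",  # default from the flag list
        "ruleTemplate": None,
        "autoYes": False,
    }
    value_flags = {
        "--tools": "selectedTools",
        "--mode": "delegateMode",
        "--rule": "ruleTemplate",
    }
    rest = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("-y", "--yes"):
            opts["autoYes"] = True
        elif tok in value_flags:
            if i + 1 >= len(tokens):
                raise ValueError(f"{tok} requires a value")
            i += 1
            opts[value_flags[tok]] = tokens[i]
        else:
            rest.append(tok)  # anything that is not a flag is requirement text
        i += 1

    if opts["selectedTools"] is not None:
        # comma-split, dropping empty entries such as a trailing comma
        opts["selectedTools"] = [t.strip() for t in opts["selectedTools"].split(",") if t.strip()]
    if opts["delegateMode"] not in ("analysis", "write"):
        raise ValueError("--mode must be 'analysis' or 'write'")

    opts["requirement"] = " ".join(rest).strip()
    if not opts["requirement"]:
        raise ValueError("E001: requirement argument missing")
    return opts
```

For example, `parse_collab_args('-y "review error handling patterns"')` sets `autoYes` and keeps the quoted text as the requirement.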
55
+ ### Step 2: Discover Available CLI Tools
56
+
57
+ ```bash
58
+ Bash("maestro tools list --json 2>/dev/null || cat ~/.maestro/cli-tools.json")
59
+ ```
60
+
61
+ Parse tool entries. Build eligible list:
62
+ - `enabled == true`
63
+ - If `--mode write`: exclude `type == "api-endpoint"`
64
+
65
+ Auto-select (when `--tools` omitted): first 3 eligible in config order.
66
+ Validate: minimum 2 eligible tools (abort if fewer).
67
+
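The eligibility filter and auto-selection in Step 2 might look like the sketch below. It assumes each tool entry is a dict with `name`, `enabled`, and `type` keys; the exact shape of `cli-tools.json` is not shown in this file, so treat those field names and the helper `select_tools` as illustrative.

```python
def select_tools(tools, mode="analysis", requested=None, limit=3):
    """Return the tool names to fan out to (hypothetical helper)."""
    # Eligible = enabled, and not an api-endpoint when running in write mode.
    eligible = [
        t["name"]
        for t in tools
        if t.get("enabled") and not (mode == "write" and t.get("type") == "api-endpoint")
    ]
    if len(eligible) < 2:
        raise ValueError("E002: fewer than 2 CLI tools eligible")

    if requested:
        missing = [name for name in requested if name not in eligible]
        if missing:
            raise ValueError(f"E003: tool not found or not enabled: {', '.join(missing)}")
        selected = list(requested)
    else:
        # Auto-select: first `limit` eligible tools, preserving config order.
        selected = eligible[:limit]

    if len(selected) < 2:
        raise ValueError("at least 2 tools must be selected")
    return selected
```

Filtering before slicing matters: taking the first three entries and filtering afterwards could leave fewer than three tools even when more eligible ones exist.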
68
+ ### Step 3: Present Collaboration Plan
69
+
70
+ **(Skip if `-y`)**
71
+
72
+ Display plan, then ask user:
73
+
74
+ ```
75
+ ============================================================
76
+ COLLABORATION PLAN
77
+ ============================================================
78
+ Requirement: {requirement}
79
+ Mode: {delegateMode}
80
+ Rule: {ruleTemplate || "none"}
81
+
82
+ Available CLI Tools (from cli-tools.json):
83
+ [✓] gemini — gemini-3.1-pro-preview [fullstack, frontend]
84
+ [✓] claude — claude-sonnet-4-6 [fullstack]
85
+ [✓] codex — gpt-5.5 [fullstack, backend]
86
+ [ ] opencode — (no model) [fullstack]
87
+
88
+ Selected: gemini, claude, codex (3 tools)
89
+
90
+ Pipeline:
91
+ 1. Fan-out → parallel delegate to each tool
92
+ 2. Cross-verification → consensus/conflict analysis
93
+ 3. Synthesis → context.md + conclusions.json
94
+ ============================================================
95
+ ```
96
+
97
+ Use `AskUserQuestion` with options:
98
+ - **Execute**: proceed with the selected tools
+ - **Modify tool selection**: let the user specify a different tool combination
+ - **Cancel**: abort
+
+ If **Modify tool selection**: ask the user which tools to use (show the eligible list), validate ≥ 2, then re-display the plan.
103
+
104
+ ### Step 4: Setup Session
105
+
106
+ ```
107
+ slug = requirement kebab-cased, max 40 chars
108
+ outputDir = .workflow/scratch/{YYYYMMDD}-collab-{slug}/
109
+ ```
110
+
111
+ Create `outputDir` + `outputDir/per-tool/`.
112
+
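One possible way to derive `slug` and `outputDir` is sketched here. The skill only says "kebab-cased, max 40 chars", so the handling of punctuation and of a hyphen left dangling by truncation is an assumption, as are the helper names.

```python
import re
from datetime import date


def make_slug(requirement: str, max_len: int = 40) -> str:
    """Lowercase, collapse non-alphanumeric runs to '-', cap at max_len characters."""
    slug = re.sub(r"[^a-z0-9]+", "-", requirement.lower()).strip("-")
    # Truncate, then drop a hyphen that the cut may have left at the end.
    return slug[:max_len].rstrip("-")


def output_dir(requirement: str, today: date) -> str:
    """Build the scratch directory path for a collab session."""
    return f".workflow/scratch/{today:%Y%m%d}-collab-{make_slug(requirement)}/"
```

Note that non-ASCII requirement text collapses to hyphens under this rule; a real implementation may need a fallback for an empty slug.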
113
+ ### Step 5: Build Delegate Prompt
114
+
115
+ Shared prompt for all tools:
116
+
117
+ ```
118
+ PURPOSE: {requirement}; success = actionable findings with evidence
119
+ TASK: {auto-decomposed into 3-5 specific verbs}
120
+ MODE: {delegateMode}
121
+ CONTEXT: @**/*
122
+ EXPECTED: Structured findings with file:line references, confidence score (0-100), prioritized recommendations. Sections: ## Findings, ## Recommendations, ## Confidence
123
+ CONSTRAINTS: {extracted from requirement}
124
+ ```
125
+
126
+ ### Step 6: Parallel Fan-Out
127
+
128
+ Launch ALL delegate calls simultaneously using multiple `Bash(run_in_background: true)` in a **single message**:
129
+
130
+ ```
131
+ // Launch all in ONE message — do NOT wait between calls
132
+ Bash({
133
+ command: `maestro delegate "${prompt}" --to gemini --mode ${mode} ${rule}`,
134
+ run_in_background: true
135
+ })
136
+ Bash({
137
+ command: `maestro delegate "${prompt}" --to claude --mode ${mode} ${rule}`,
138
+ run_in_background: true
139
+ })
140
+ Bash({
141
+ command: `maestro delegate "${prompt}" --to codex --mode ${mode} ${rule}`,
142
+ run_in_background: true
143
+ })
144
+ ```
145
+
146
+ **After launching all calls → STOP immediately. Do not output anything. Wait for background completion callbacks.**
147
+
148
+ ### Step 7: Collect Results
149
+
150
+ As each background callback arrives:
151
+ 1. Extract exec ID from output (`[MAESTRO_EXEC_ID=...]`)
152
+ 2. Run `maestro delegate output <id>` to get full result
153
+ 3. Write raw output to `per-tool/{tool}-output.md`
154
+
155
+ **Wait until ALL callbacks have arrived before proceeding.**
156
+
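Extracting the exec ID from a callback's output could be done with a single regular expression. The character set of a real ID is not documented here, so the pattern below (anything up to the closing bracket, no whitespace) is an assumption.

```python
import re

# Matches the marker shown in Step 7: [MAESTRO_EXEC_ID=<id>]
EXEC_ID = re.compile(r"\[MAESTRO_EXEC_ID=([^\]\s]+)\]")


def extract_exec_id(output: str):
    """Return the first exec ID found in delegate output, or None if absent."""
    match = EXEC_ID.search(output)
    return match.group(1) if match else None
```

A `None` result would be a natural point to record the tool as failed (W001) rather than calling `maestro delegate output` with an empty ID.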
157
+ ### Step 8: Cross-Verify
158
+
159
+ Read all `per-tool/{tool}-output.md` files. Compare findings across tools:
160
+
161
+ For each finding, classify:
162
+ - **[CONSENSUS]**: 2+ tools agree on same finding/recommendation
163
+ - **[CONFLICT]**: Tools disagree on approach or assessment
164
+ - **[UNIQUE]**: Finding from only one tool
165
+
166
+ Compute `consensus_level = (consensus_count / total_findings) * 100`.
167
+
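The classification and the `consensus_level` formula can be illustrated as follows. Deciding whether two tools are talking about the "same finding" is a judgment the agent makes; this sketch assumes that matching has already happened and that each finding is a mapping from tool name to a normalized stance string.

```python
def classify(positions: dict) -> str:
    """Tag one finding given {tool: stance} (assumed, pre-matched input)."""
    if len(set(positions.values())) > 1:
        return "CONFLICT"   # tools reported it but disagree
    if len(positions) >= 2:
        return "CONSENSUS"  # 2+ tools, one shared stance
    return "UNIQUE"         # only one tool reported it


def consensus_level(findings: list) -> float:
    """(consensus_count / total_findings) * 100, or 0.0 when there are no findings."""
    if not findings:
        return 0.0
    tags = [classify(p) for p in findings]
    return tags.count("CONSENSUS") / len(tags) * 100
```

With two consensus findings out of four, `consensus_level` evaluates to `50.0`.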
168
+ ### Step 9: Synthesize Outputs
169
+
170
+ Resolve conflicts via evidence-weighted voting:
171
+ - Higher confidence tool's position wins
172
+ - More specific evidence (file:line refs) wins over general statements
173
+ - If tied: mark as `[SUGGESTED]`
174
+
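The voting rules above do not say how confidence and evidence specificity combine, so the sketch below makes one explicit assumption: compare confidence first, then the number of `file:line` references, and fall back to `[SUGGESTED]` on an exact tie. The `RESOLVED` tag and the input shape are hypothetical.

```python
import re

# Rough detector for file:line references such as src/auth.py:42
FILE_LINE = re.compile(r"[\w./-]+:\d+")


def resolve_conflict(positions: list) -> dict:
    """Pick a winning tool for one [CONFLICT] item, or mark it [SUGGESTED] on a tie."""
    def weight(p):
        # Assumed ordering: confidence dominates, evidence count breaks ties.
        return (p["confidence"], len(FILE_LINE.findall(p["evidence"])))

    ranked = sorted(positions, key=weight, reverse=True)
    if len(ranked) > 1 and weight(ranked[0]) == weight(ranked[1]):
        return {"tag": "SUGGESTED", "winner": None}
    return {"tag": "RESOLVED", "winner": ranked[0]["tool"]}
```

A weighted sum would be an equally defensible reading of the rules; the lexicographic order is chosen here only because it needs no tuning constants.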
175
+ Generate three output files:
176
+
177
+ #### collab-report.md
178
+
179
+ ```markdown
180
+ # Multi-CLI Collaboration Report — {requirement}
181
+
182
+ ## Summary
183
+ - Tools: {tool_list}
184
+ - Consensus level: {N}%
185
+ - Key finding: {top finding}
186
+
187
+ ## Consensus Findings
188
+ {findings agreed by 2+ tools}
189
+
190
+ ## Resolved Conflicts
191
+ {conflicts resolved with rationale and winning tool}
192
+
193
+ ## Unresolved Items
194
+ {items requiring human judgment}
195
+
196
+ ## Unique Insights
197
+ {valuable unique findings with source tool attribution}
198
+
199
+ ## Recommendations
200
+ {prioritized, merged recommendations}
201
+
202
+ ## Per-Tool Confidence
203
+ | Tool | Confidence | Key Strength |
204
+ |------|-----------|--------------|
205
+ ```
206
+
207
+ #### context.md (standard downstream format)
208
+
209
+ ```markdown
210
+ # Context: {requirement}
211
+
212
+ **Date**: {date}
213
+ **Mode**: collab ({tool_list})
214
+ **Consensus Level**: {N}%
215
+
216
+ ## Decisions
217
+
218
+ ### Decision N: {TITLE}
219
+ - **Context**: {what and why}
220
+ - **Options**: 1. {opt1} 2. {opt2}
221
+ - **Chosen**: {selected}
222
+ - **Reason**: {rationale — which tools agreed/disagreed}
223
+
224
+ ## Constraints
225
+
226
+ ### Locked
227
+ {[CONSENSUS] items — treat as confirmed decisions}
228
+
229
+ ### Free
230
+ {[UNIQUE] items with strong evidence — implementer discretion}
231
+
232
+ ### Deferred
233
+ {[UNRESOLVED] conflicts — require human judgment}
234
+
235
+ ## Code Context
236
+ {file:line references from per-tool findings}
237
+ ```
238
+
239
+ #### conclusions.json
240
+
241
+ ```json
242
+ {
243
+ "session_id": "{sessionId}",
244
+ "subject": "{requirement}",
245
+ "mode": "collab",
246
+ "tools": ["gemini", "claude", "codex"],
247
+ "consensus_level": 85,
248
+ "recommendation": "Go|No-Go|Conditional",
249
+ "confidence": "high|medium|low",
250
+ "dimensions": [
251
+ { "name": "gemini", "score": 80, "findings": "...", "recommendations": "..." }
252
+ ],
253
+ "decisions": [
254
+ { "title": "...", "classification": "locked|free|deferred", "source_tools": [], "rationale": "..." }
255
+ ],
256
+ "timestamp": "<ISO>"
257
+ }
258
+ ```
259
+
260
+ ### Step 10: Register Artifact
261
+
262
+ Append to `.workflow/state.json`:
263
+
264
+ ```json
265
+ {
266
+ "id": "CLB-{next_id}",
267
+ "type": "collab",
268
+ "milestone": "{current_milestone}",
269
+ "phase": null,
270
+ "scope": "adhoc",
271
+ "path": "scratch/{YYYYMMDD}-collab-{slug}",
272
+ "status": "completed",
273
+ "depends_on": null,
274
+ "harvested": false,
275
+ "created_at": "<ISO>",
276
+ "completed_at": "<ISO>"
277
+ }
278
+ ```
279
+
280
+ ### Step 11: Display Summary
281
+
282
+ ```
283
+ ============================================================
284
+ MULTI-CLI COLLABORATION COMPLETE
285
+ ============================================================
286
+ Requirement: {requirement}
287
+ Tools: {tool_list}
288
+ Consensus Level: {N}%
289
+
290
+ Per-Tool:
291
+ gemini: completed (confidence: {N}%)
292
+ claude: completed (confidence: {N}%)
293
+ codex: completed (confidence: {N}%)
294
+
295
+ Artifact: CLB-{id}
296
+ Output: {outputDir}/
297
+
298
+ Next steps:
299
+ /maestro-analyze "{topic}" — Deep feasibility analysis
300
+ /maestro-plan "{phase} --dir {dir}" — Plan from collab conclusions
301
+ /maestro-brainstorm "{topic}" — Expand with multi-role brainstorm
302
+ ============================================================
303
+ ```
304
+
305
+ </execution>
306
+
307
+ <error_codes>
308
+
309
+ | Code | Severity | Condition | Recovery |
310
+ |------|----------|-----------|----------|
311
+ | E001 | error | Requirement argument missing | Prompt for requirement |
312
+ | E002 | error | Fewer than 2 CLI tools eligible | Check cli-tools.json, enable more tools |
313
+ | E003 | error | Specified tool not found/enabled | Show available tools |
314
+ | E004 | error | All delegates failed | Abort with per-tool error details |
315
+ | W001 | warning | One tool failed | Continue with remaining tools |
316
+ | W002 | warning | >50% conflicts in cross-verify | Highlight in report, recommend manual review |
317
+ | W003 | warning | Low consensus level (<40%) | Flag in summary |
318
+
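As a rough illustration of how the table's conditions could be evaluated after cross-verification, consider the sketch below. It treats any partial failure as W001, which is an interpretation of "One tool failed", and the function name and parameters are hypothetical.

```python
def collab_warnings(total_tools, failed_tools, conflict_count, total_findings, consensus_level):
    """Return warning codes for a run; raise if every delegate failed (E004)."""
    if failed_tools >= total_tools:
        raise RuntimeError("E004: all delegates failed")

    warnings = []
    if failed_tools > 0:
        warnings.append("W001")  # continue with the remaining tools
    if total_findings and conflict_count / total_findings > 0.5:
        warnings.append("W002")  # more than 50% of findings are conflicts
    if consensus_level < 40:
        warnings.append("W003")  # low consensus level
    return warnings
```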
319
+ </error_codes>
320
+
321
+ <success_criteria>
322
+ - [ ] Available tools discovered from cli-tools.json with eligibility filtering
323
+ - [ ] Plan presented via AskUserQuestion with tool modification option (unless -y)
324
+ - [ ] All delegates launched in parallel via Bash(run_in_background: true)
325
+ - [ ] Execution stopped after launch — waited for all callbacks
326
+ - [ ] Per-tool outputs written to per-tool/{tool}-output.md
327
+ - [ ] Cross-verification: consensus/conflict/unique classification complete
328
+ - [ ] collab-report.md produced with merged findings
329
+ - [ ] context.md produced in Locked/Free/Deferred format (downstream compatible)
330
+ - [ ] conclusions.json produced (plan fast-track compatible)
331
+ - [ ] CLB artifact registered in state.json
332
+ - [ ] Partial degradation: continued if 1+ tools succeeded
333
+ </success_criteria>
@@ -139,9 +139,15 @@ After each barrier skill completes, read its artifacts and update `state.context
139
139
  }
140
140
  ```
141
141
 
142
- 7. **Initialize plan tracking** (dual-track: status.json + update_plan):
142
+ 7. **Initialize tracking** (goal constraint + plan sub-items):
143
143
 
144
144
  ```
145
+ // Goal = outer constraint — ensures entire chain completes
146
+ functions.create_goal({
147
+ objective: `Maestro ${chain_name}: ${steps.length} steps [${steps.map(s => s.skill).join(' → ')}]`
148
+ })
149
+
150
+ // Plan = inner tracking — sub-step progress
145
151
  functions.update_plan({
146
152
  plan: steps.map((step, i) => ({
147
153
  id: `step-${i}`,
@@ -233,9 +239,12 @@ Object with all fields required: `status` ("completed"|"failed"), `skill_call` (
233
239
 
234
240
  ### Phase 3: Completion Report
235
241
 
236
- Finalize dual tracking:
242
+ Finalize tracking:
237
243
  - status.json: `state.status = 'completed'`
238
244
  - update_plan: all steps → `"completed"` (skipped steps also marked completed)
245
+ - **update_goal**: `functions.update_goal({ status: "complete" })` — release goal constraint
246
+
247
+ **Note**: Abort path (Phase 2 step 7) does NOT call `update_goal` — goal stays running for `--continue` resume.
239
248
 
240
249
  ```
241
250
  === COORDINATE COMPLETE ===
@@ -0,0 +1,631 @@
1
+ ---
2
+ name: maestro-collab
3
+ description: Multi-CLI collaborative analysis -- fan-out to multiple CLI tools, cross-verify, synthesize
4
+ argument-hint: "\"<requirement>\" [--tools gemini,qwen,claude] [--mode analysis|write] [--rule <template>] [-y]"
5
+ allowed-tools: spawn_agents_on_csv, Read, Write, Edit, Bash, Glob, Grep, request_user_input
6
+ ---
7
+
8
+ <purpose>
9
+ Wave-based multi-CLI collaboration using `spawn_agents_on_csv`. Diamond topology: parallel CLI fan-out (Wave 1), cross-verification (Wave 2), then unified synthesis (Wave 3).
10
+
11
+ Each CLI tool independently analyzes the same requirement from its own perspective. Results are cross-verified for conflicts, then synthesized into a single actionable output.
12
+
13
+ **Core workflow**: Parse Requirement -> CLI Fan-Out -> Cross-Verify -> Synthesize
14
+
15
+ ```
16
+ +---------------------------------------------------------------------------+
17
+ | COLLAB CSV WAVE WORKFLOW |
18
+ +---------------------------------------------------------------------------+
19
+ | |
20
+ | Phase 1: Requirement Resolution -> CSV |
21
+ | +-- Parse requirement and flags from arguments |
22
+ | +-- Select CLI tools (explicit --tools or auto-select) |
23
+ | +-- Load project context (project.md, specs, codebase) |
24
+ | +-- Generate tasks.csv with fan-out + verify + synthesis rows |
25
+ | +-- User validates tool selection (skip if -y) |
26
+ | |
27
+ | Phase 2: Wave Execution Engine |
28
+ | +-- Wave 1: CLI Fan-Out (parallel, 2-5 agents) |
29
+ | | +-- Each agent delegates to one CLI tool via exec_command |
30
+ | | +-- Same requirement, different CLI perspective |
31
+ | | +-- Results: per-tool findings + recommendations |
32
+ | +-- Wave 2: Cross-Verification (single agent) |
33
+ | | +-- Compare all CLI outputs for consensus/conflicts |
34
+ | | +-- Tag: [CONSENSUS] / [CONFLICT] / [UNIQUE] |
35
+ | | +-- Results: conflict matrix + agreement areas |
36
+ | +-- Wave 3: Synthesis (single agent) |
37
+ | | +-- Merge verified findings into actionable output |
38
+ | | +-- Resolve conflicts with evidence-weighted voting |
39
+ | | +-- Generate final collab-report.md |
40
+ | +-- discoveries.ndjson shared across all waves (append-only) |
41
+ | |
42
+ | Phase 3: Results Aggregation |
43
+ | +-- Export results.csv + collab-report.md |
44
+ | +-- Display summary with consensus level + next steps |
45
+ | |
46
+ +---------------------------------------------------------------------------+
47
+ ```
48
+
49
+ </purpose>
50
+
51
+ <context>
52
+ ```bash
53
+ $maestro-collab "analyze the auth module for security vulnerabilities"
54
+ $maestro-collab "design a caching strategy for the API layer" --tools gemini,qwen,claude
55
+ $maestro-collab -y "review error handling patterns across the codebase"
56
+ $maestro-collab "refactor user service to use repository pattern" --mode write --tools gemini,claude
57
+ ```
58
+
59
+ **Flags**:
60
+ - `--tools <list>`: Comma-separated CLI tools (default: auto-select top 3 enabled from cli-tools.json)
61
+ - `--mode analysis|write`: Delegate mode (default: analysis)
62
+ - `--rule <template>`: Shared rule template for all delegates
63
+ - `-y, --yes`: Skip all confirmations (auto mode)
64
+ - `-c, --concurrency N`: Max concurrent agents within each wave (default: 5)
65
+
66
+ **Auto-select logic** (when `--tools` omitted):
67
+ 1. Read `~/.maestro/cli-tools.json`
+ 2. Filter `enabled == true`
+ 3. Exclude `api-endpoint` type tools when `--mode write`
+ 4. Take the first 3 remaining tools in config order
71
+
72
+ **Output Directory**: `.workflow/.csv-wave/{session-id}/`
73
+ **Core Output**: `tasks.csv` + `results.csv` + `discoveries.ndjson` + `collab-report.md`
74
+ </context>
75
+
76
+ <csv_schema>
77
+
78
+ ### tasks.csv (Master State)
79
+
80
+ ```csv
81
+ id,title,description,tool,role,prompt,mode,rule,deps,context_from,wave,status,findings,recommendations,confidence,error
82
+ "1","CLI: gemini","Analyze requirement via gemini CLI","gemini","analyze","<full prompt>","analysis","","","","1","","","","",""
83
+ "2","CLI: qwen","Analyze requirement via qwen CLI","qwen","analyze","<full prompt>","analysis","","","","1","","","","",""
84
+ "3","CLI: claude","Analyze requirement via claude CLI","claude","analyze","<full prompt>","analysis","","","","1","","","","",""
85
+ "4","Cross-Verify","Compare all CLI outputs: tag consensus, conflicts, unique findings","","","","","","1;2;3","1;2;3","2","","","","",""
86
+ "5","Synthesis","Merge verified findings into actionable collab-report.md","","","","","","4","4","3","","","","",""
87
+ ```
88
+
89
+ **Columns**:
90
+
91
+ | Column | Phase | Description |
92
+ |--------|-------|-------------|
93
+ | `id` | Input | Unique task identifier |
94
+ | `title` | Input | Short task title |
95
+ | `description` | Input | Detailed instructions for this task |
96
+ | `tool` | Input | CLI tool name (wave 1 only) |
97
+ | `role` | Input | Delegate --role value |
98
+ | `prompt` | Input | Full 6-field prompt for delegate |
99
+ | `mode` | Input | analysis or write |
100
+ | `rule` | Input | --rule template name (optional) |
101
+ | `deps` | Input | Semicolon-separated dependency task IDs |
102
+ | `context_from` | Input | Semicolon-separated task IDs for prev_context |
103
+ | `wave` | Computed | Wave number (1=fan-out, 2=verify, 3=synthesis) |
104
+ | `status` | Output | pending -> completed / failed |
105
+ | `findings` | Output | Key findings summary (max 500 chars) |
106
+ | `recommendations` | Output | Per-tool recommendations |
107
+ | `confidence` | Output | Self-assessed confidence (0-100) |
108
+ | `error` | Output | Error message if failed |
109
+
110
+ ### Session Structure
111
+
112
+ ```
113
+ .workflow/.csv-wave/{YYYYMMDD}-collab-{slug}/
114
+ +-- tasks.csv
115
+ +-- results.csv
116
+ +-- discoveries.ndjson
117
+ +-- collab-report.md
118
+ +-- context.md ← standard Locked/Free/Deferred format (downstream compatible)
119
+ +-- conclusions.json ← structured conclusions (plan fast-track compatible)
120
+ +-- wave-{N}.csv (temporary)
121
+ +-- per-tool/
122
+ +-- gemini-output.md
123
+ +-- qwen-output.md
124
+ +-- claude-output.md
125
+ ```
126
+
127
+ ### Downstream Compatibility
128
+
129
+ | Consumer | Consumption Path | Artifact |
130
+ |----------|-----------------|----------|
131
+ | **maestro-plan** | `$maestro-plan "N --dir .workflow/scratch/{collab-session}/"` | `context.md` + `conclusions.json` |
132
+ | **maestro-analyze** | auto via `state.json.artifacts[]` (type=collab) | `context.md` as prior context |
133
+ | **maestro-brainstorm** | auto via `state.json.artifacts[]` (type=collab) | `context.md` as supplementary context |
134
+ | **maestro-ralph** | auto — lifecycle position inference includes collab | artifact chain lookup |
135
+
136
+ `context.md` uses the standard Locked/Free/Deferred decision format. `conclusions.json` follows the same schema as maestro-analyze's output. This allows plan to skip wave 1 exploration when collab has already produced structured conclusions.
137
+ </csv_schema>
138
+
139
+ <invariants>
140
+ 1. **Plan Before Execute**: Present collaboration plan with tool selection for user approval before any CLI invocation
141
+ 2. **Wave Order is Sacred**: Never execute wave 2 before wave 1 completes
142
+ 3. **CSV is Source of Truth**: Master tasks.csv holds all state
143
+ 4. **Context Propagation**: prev_context built from master CSV, not from memory
144
+ 5. **Discovery Board is Append-Only**: Never modify or delete discoveries.ndjson
145
+ 6. **Same Prompt, Different Tool**: Wave 1 agents all use the same base prompt, only --to differs
146
+ 7. **Minimum 2 Tools**: Collaboration requires at least 2 CLI tools; abort if fewer enabled
147
+ 8. **Delegate Protocol**: All exec_command calls follow delegate-protocol.codex.md (yield_time + poll)
148
+ 9. **DO NOT STOP**: Continuous execution until all waves complete
149
+ 10. **Partial Degradation**: If one or more tools fail in wave 1, continue with the available results
150
+ </invariants>
151
+
152
+ <execution>
153
+
154
+ ### Session Initialization
155
+
156
+ **Parse from `$ARGUMENTS`**:
157
+
158
+ | Variable | Source | Default |
159
+ |----------|--------|---------|
160
+ | `AUTO_YES` | `--yes` or `-y` | false |
161
+ | `maxConcurrency` | `--concurrency N` or `-c N` | 5 |
162
+ | `selectedTools` | `--tools <list>` | auto-select |
163
+ | `delegateMode` | `--mode` | `analysis` |
164
+ | `ruleTemplate` | `--rule` | null |
165
+ | `requirement` | remaining text after flag removal | "" (E001 if empty) |
166
+
167
+ **Auto-bootstrap**: If `.workflow/` missing, create minimal structure.
168
+
169
+ **Session paths** (UTC+8 date prefix):
170
+ - `slug` ← requirement kebab-cased, max 40 chars
171
+ - `sessionFolder`: `.workflow/.csv-wave/{YYYYMMDD}-collab-{slug}/`
172
+
173
+ - `scratchDir`: `.workflow/scratch/{YYYYMMDD}-collab-{slug}/`
174
+
175
+ Create `sessionFolder` + `sessionFolder/per-tool/` + `scratchDir`.
176
+
177
+ ### Phase 1: Requirement Resolution -> CSV
178
+
179
+ **Objective**: Parse requirement, discover available tools, present plan for user approval, generate tasks.csv.
180
+
181
+ **1. Discover available CLI tools**:
182
+
183
+ Read `~/.maestro/cli-tools.json` → extract all tool entries. Build `availableTools[]`:
184
+
185
+ ```
186
+ For each tool in config.tools:
187
+ availableTools.push({
188
+ name: tool.name,
189
+ enabled: tool.enabled,
190
+ type: tool.type, // builtin | cli-wrapper | api-endpoint
191
+ model: tool.primaryModel,
192
+ tags: tool.tags, // [fullstack, frontend, backend, ...]
193
+ eligible: tool.enabled
194
+ && (delegateMode != "write" || tool.type != "api-endpoint")
195
+ })
196
+ ```
197
+
198
+ Validate: at least 2 eligible tools required (E002 if fewer).
199
+
200
+ **2. Auto-recommend tool selection**:
201
+
202
+ | Source | Logic |
203
+ |--------|-------|
204
+ | `--tools` explicit | Use provided list, validate each is eligible |
205
+ | No `--tools` | Take first 3 eligible tools in config order |
206
+
207
+ Mark each eligible tool as `recommended: true/false` based on auto-selection.
208
+
209
+ **3. Context loading**:
210
+ - Read `.workflow/project.md` if exists
211
+ - Load project specs: `maestro spec load --category arch,coding` (if available)
212
+ - Grep for relevant codebase files based on requirement keywords
213
+
214
+ **4. Build delegate prompt** (shared across all tools):
215
+
216
+ ```
217
+ PURPOSE: {requirement}; success = actionable findings with evidence
218
+ TASK: {auto-decomposed from requirement into 3-5 specific verbs}
219
+ MODE: {delegateMode}
220
+ CONTEXT: @**/* | Memory: {project context if available}
221
+ EXPECTED: Structured findings with file:line references, confidence score (0-100), prioritized recommendations
222
+ CONSTRAINTS: {from requirement} | Output findings as structured text with sections: ## Findings, ## Recommendations, ## Confidence
223
+ ```
224
+
225
+ **5. Present Collaboration Plan** (skip if AUTO_YES):
226
+
227
+ Display plan summary, then `request_user_input` for approval:
228
+
229
+ ```
230
+ ============================================================
231
+ COLLABORATION PLAN
232
+ ============================================================
233
+ Requirement: {requirement}
234
+ Mode: {delegateMode}
235
+ Rule: {ruleTemplate || "none"}
236
+
237
+ Available CLI Tools (from cli-tools.json):
238
+ [✓] gemini — gemini-3.1-pro-preview [fullstack, frontend]
239
+ [✓] claude — claude-sonnet-4-6 [fullstack]
240
+ [✓] codex — gpt-5.5 [fullstack, backend]
241
+ [ ] opencode — (no model) [fullstack]
242
+
243
+ Selected: gemini, claude, codex (3 tools)
244
+
245
+ Pipeline:
246
+ Wave 1: Fan-out → gemini + claude + codex (parallel)
247
+ Wave 2: Cross-verification (conflicts/consensus)
248
+ Wave 3: Synthesis → context.md + conclusions.json
249
+
250
+ Prompt Preview:
251
+ PURPOSE: {first 80 chars}...
252
+ TASK: {task verbs}
253
+ ============================================================
254
+ ```
255
+
256
+ ```json
+ request_user_input({
+ "questions": [{
+ "id": "collab_plan",
+ "header": "Collaboration Plan",
+ "question": "This is the collaboration plan. How would you like to proceed?",
+ "options": [
+ {
+ "label": "Execute (Recommended)",
+ "description": "Start the collaborative analysis with the {N} selected CLI tools"
+ },
+ {
+ "label": "Modify tool selection",
+ "description": "Change the combination of CLI tools taking part in the collaboration"
+ },
+ {
+ "label": "Cancel",
+ "description": "Abort the collaboration without making any calls"
+ }
+ ]
+ }]
+ })
+ ```
+
+ **Handle user response**:
+
+ | Response | Action |
+ |----------|--------|
+ | **Execute** | Proceed to step 6 (CSV generation) |
+ | **Modify tool selection** | → Tool Modification Interaction (step 5a) |
+ | **Cancel** | Abort with message "Collaboration cancelled" |
287
+
288
+ #### 5a. Tool Modification Interaction
289
+
290
+ Present all eligible tools as toggleable options:
291
+
292
+ ```json
293
+ request_user_input({
294
+ "questions": [{
295
+ "id": "tool_selection",
296
+ "header": "CLI Tool Selection",
297
+ "question": "Select the CLI tools to take part in the collaboration (at least 2):",
298
+ "options": [
299
+ { "label": "gemini", "description": "gemini-3.1-pro-preview — fullstack, frontend" },
300
+ { "label": "claude", "description": "claude-sonnet-4-6 — fullstack" },
301
+ { "label": "codex", "description": "gpt-5.5 — fullstack, backend" },
302
+ { "label": "opencode", "description": "(no model) — fullstack" }
303
+ ]
304
+ }]
305
+ })
306
+ ```
307
+
308
+ Options are **dynamically built** from `availableTools.filter(t => t.eligible)`:
309
+ - `label` = tool name
310
+ - `description` = `{primaryModel} — {tags.join(", ")}`
311
+
312
+ Parse user selection → update `selectedTools`. Validate minimum 2 (re-prompt if fewer).
313
+ Return to step 5 to re-display updated plan.
314
+
315
+ **6. CSV generation**:
316
+ - N tool rows (wave 1, one per selected tool)
317
+ - 1 cross-verify row (wave 2, deps on all wave 1)
318
+ - 1 synthesis row (wave 3, deps on wave 2)
319
+
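The row layout described in step 6 can be sketched with the standard `csv` module. Column names come from the `tasks.csv` schema above; the helper names are hypothetical, and setting `status` to `pending` is an assumption (the sample rows leave it blank, while the wave filters test for `status == pending`).

```python
import csv
import io

COLUMNS = ["id", "title", "description", "tool", "role", "prompt", "mode", "rule",
           "deps", "context_from", "wave", "status", "findings", "recommendations",
           "confidence", "error"]


def build_rows(tools, prompt, mode="analysis", rule=""):
    """N fan-out rows (wave 1), one cross-verify row (wave 2), one synthesis row (wave 3)."""
    rows = []
    for i, tool in enumerate(tools, start=1):
        rows.append({"id": str(i), "title": f"CLI: {tool}",
                     "description": f"Analyze requirement via {tool} CLI",
                     "tool": tool, "role": "analyze", "prompt": prompt,
                     "mode": mode, "rule": rule, "wave": "1"})
    fan_out_ids = ";".join(r["id"] for r in rows)  # semicolon-separated deps
    verify_id = str(len(tools) + 1)
    rows.append({"id": verify_id, "title": "Cross-Verify",
                 "description": "Compare all CLI outputs: tag consensus, conflicts, unique findings",
                 "deps": fan_out_ids, "context_from": fan_out_ids, "wave": "2"})
    rows.append({"id": str(len(tools) + 2), "title": "Synthesis",
                 "description": "Merge verified findings into actionable collab-report.md",
                 "deps": verify_id, "context_from": verify_id, "wave": "3"})
    # Fill every schema column; unset fields become empty strings.
    return [{**{c: "" for c in COLUMNS}, **r, "status": "pending"} for r in rows]


def to_csv(rows) -> str:
    """Serialize rows with every field quoted (this also quotes the header row)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, quoting=csv.QUOTE_ALL)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Letting the `csv` module handle quoting avoids corrupting the file when a prompt contains commas, quotes, or newlines.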
320
+ ### Phase 2: Wave Execution Engine
321
+
322
+ #### Wave 1: CLI Fan-Out (Parallel)
323
+
324
+ Filter `wave == 1 && status == pending` from master CSV. Write `wave-1.csv`.
325
+
326
+ Each wave 1 agent:
327
+
328
+ 1. Read task row: extract `tool`, `prompt`, `mode`, `rule`
329
+ 2. Execute delegate (blocking):
330
+
331
+ ```
332
+ exec_command({
333
+ cmd: `maestro delegate "${prompt}" --to ${tool} --mode ${mode} ${rule ? '--rule ' + rule : ''}`,
334
+ yield_time_ms: 30000,
335
+ max_output_tokens: 6000
336
+ })
337
+ // If session_id returned -> poll write_stdin until completion
338
+ // See @~/.maestro/workflows/delegate-protocol.codex.md
339
+ ```
340
+
341
+ 3. Parse delegate output
342
+ 4. Write per-tool output to `per-tool/{tool}-output.md`
343
+ 5. Share findings via discovery board
344
+
345
+ ```javascript
346
+ spawn_agents_on_csv({
347
+ csv_path: `${sessionFolder}/wave-1.csv`,
348
+ id_column: "id",
349
+ instruction: buildFanOutInstruction(sessionFolder),
350
+ max_concurrency: maxConcurrency,
351
+ max_runtime_seconds: 3600,
352
+ output_csv_path: `${sessionFolder}/wave-1-results.csv`,
353
+ output_schema: { id, status: ["completed"|"failed"], findings, recommendations, confidence, error }
354
+ })
355
+ ```
356
+
357
+ Merge results into master `tasks.csv`, delete `wave-1.csv`.
358
+
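Merging a wave's results back into the master table amounts to copying the output columns by `id`. A minimal sketch, assuming both tables are already loaded as lists of dicts (for instance via `csv.DictReader`); the helper name is hypothetical.

```python
# Output-phase columns from the tasks.csv schema
OUTPUT_COLUMNS = ("status", "findings", "recommendations", "confidence", "error")


def merge_wave_results(master_rows, result_rows):
    """Return new master rows with output columns overwritten from matching results."""
    by_id = {r["id"]: r for r in result_rows}
    merged = []
    for row in master_rows:
        update = by_id.get(row["id"], {})
        # Only output columns are copied; input columns in the master stay untouched.
        merged.append({**row, **{c: update[c] for c in OUTPUT_COLUMNS if c in update}})
    return merged
```

Rows without a matching result keep their previous values, so a wave can be re-run for the tasks that remain pending.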
359
+ **Fan-Out Agent Instruction**:
360
+
361
+ ```
362
+ You are a CLI collaboration agent. Your task is to delegate analysis to a specific CLI tool and capture its output.
363
+
364
+ 1. Read your task row for: tool, prompt, mode, rule
365
+ 2. Execute the delegate call using exec_command (follow delegate-protocol.codex.md):
366
+ exec_command({
367
+ cmd: `maestro delegate "<prompt>" --to <tool> --mode <mode> [--rule <rule>]`,
368
+ yield_time_ms: 30000, max_output_tokens: 6000
369
+ })
370
+ 3. If session_id returned, poll via write_stdin until completion
371
+ 4. Write full output to {sessionFolder}/per-tool/{tool}-output.md
372
+ 5. Extract: findings (key points), recommendations (actionable items), confidence (0-100)
373
+ 6. Share via discoveries.ndjson: type="cli_finding", data={tool, dimension, finding, confidence}
374
+ 7. Report result with findings, recommendations, confidence
375
+ ```
376
+
377
+ #### Wave 2: Cross-Verification (Single Agent)
378
+
379
+ Filter `wave == 2 && status == pending`. Build `prev_context` from wave 1 findings.
380
+
381
+ ```javascript
382
+ spawn_agents_on_csv({
383
+ csv_path: `${sessionFolder}/wave-2.csv`,
384
+ id_column: "id",
385
+ instruction: buildCrossVerifyInstruction(sessionFolder),
386
+ max_concurrency: 1,
387
+ max_runtime_seconds: 3600,
388
+ output_csv_path: `${sessionFolder}/wave-2-results.csv`,
389
+ output_schema: { id, status: ["completed"|"failed"], findings, recommendations, confidence, error }
390
+ })
391
+ ```
392
+
393
+ **Cross-Verify Agent Instruction**:
394
+
395
+ ```
396
+ You are a cross-verification agent. Compare outputs from multiple CLI tools.
397
+
398
+ 1. Read all per-tool outputs from {sessionFolder}/per-tool/
399
+ 2. Read discoveries.ndjson for shared findings
400
+ 3. For each finding across tools, classify:
401
+ - [CONSENSUS]: 2+ tools agree on same finding/recommendation
402
+ - [CONFLICT]: Tools disagree on approach/assessment
403
+ - [UNIQUE]: Finding from only one tool (may be valuable or noise)
404
+ 4. For [CONFLICT] items: note each tool's position and evidence strength
405
+ 5. Compute consensus_level: (consensus_count / total_findings) * 100
406
+ 6. Write findings as structured text:
407
+ ## Consensus Areas
408
+ ## Conflicts (with per-tool positions)
409
+ ## Unique Findings (with source tool)
410
+ ## Consensus Level: {N}%
411
+ ```
412
+
413
+ Merge results into master `tasks.csv`, delete `wave-2.csv`.
414
+
415
+ #### Wave 3: Synthesis (Single Agent)
416
+
417
+ Filter `wave == 3 && status == pending`. Build `prev_context` from wave 2 findings.
418
+
419
+ ```javascript
420
+ spawn_agents_on_csv({
421
+ csv_path: `${sessionFolder}/wave-3.csv`,
422
+ id_column: "id",
423
+ instruction: buildSynthesisInstruction(sessionFolder),
424
+ max_concurrency: 1,
425
+ max_runtime_seconds: 3600,
426
+ output_csv_path: `${sessionFolder}/wave-3-results.csv`,
427
+ output_schema: { id, status: ["completed"|"failed"], findings, recommendations, confidence, error }
428
+ })
429
+ ```
430
+
431
+ **Synthesis Agent Instruction**:
432
+
433
+ ```
434
+ You are a synthesis agent. Merge cross-verified findings into a final report.
435
+
436
+ 1. Read cross-verification results from prev_context
437
+ 2. Read all per-tool outputs from {sessionFolder}/per-tool/
438
+ 3. Read discoveries.ndjson
439
+ 4. Resolve [CONFLICT] items via evidence-weighted voting:
440
+ - Higher confidence tool's position wins
441
+ - More specific evidence (file:line refs) wins over general statements
442
+ - If tied: present both with [SUGGESTED] tag and carry the item forward as [UNRESOLVED]
443
+ 5. Generate collab-report.md:
444
+
445
+ # Multi-CLI Collaboration Report -- {requirement}
446
+
447
+ ## Summary
448
+ - Tools: {tool_list}
449
+ - Consensus level: {N}%
450
+ - Key finding: {top finding}
451
+
452
+ ## Consensus Findings
453
+ {merged findings agreed by 2+ tools}
454
+
455
+ ## Resolved Conflicts
456
+ {conflicts resolved with rationale}
457
+
458
+ ## Unresolved Items
459
+ {items requiring human judgment}
460
+
461
+ ## Unique Insights
462
+ {valuable unique findings with source attribution}
463
+
464
+ ## Recommendations
465
+ {prioritized, merged recommendations}
466
+
467
+ ## Per-Tool Confidence
468
+ | Tool | Confidence | Key Strength |
469
+ |------|-----------|--------------|
470
+
471
+ 6. Generate context.md (standard downstream format):
472
+
473
+ # Context: {requirement}
474
+
475
+ **Date**: {date}
476
+ **Mode**: collab ({tool_list})
477
+ **Consensus Level**: {N}%
478
+
479
+ ## Decisions
480
+
481
+ ### Decision N: {TITLE}
482
+ - **Context**: {what and why}
483
+ - **Options**: 1. {opt1} 2. {opt2}
484
+ - **Chosen**: {selected — from consensus or evidence-weighted resolution}
485
+ - **Reason**: {rationale — include which tools agreed/disagreed}
486
+
487
+ ## Constraints
488
+
489
+ ### Locked
490
+ {[CONSENSUS] items — agreed by 2+ tools, treat as confirmed decisions}
491
+
492
+ ### Free
493
+ {[UNIQUE] items with strong evidence — implementer may adopt or skip}
494
+
495
+ ### Deferred
496
+ {[UNRESOLVED] conflicts — require human judgment before proceeding}
497
+
498
+ ## Code Context
499
+ {file:line references from per-tool findings}
500
+
501
+ 7. Generate conclusions.json (plan fast-track compatible):
502
+
503
+ {
504
+ "session_id": "<session>",
505
+ "subject": "<requirement>",
506
+ "mode": "collab",
507
+ "tools": ["gemini", "qwen", "claude"],
508
+ "consensus_level": 85,
509
+ "recommendation": "Go|No-Go|Conditional",
510
+ "confidence": "high|medium|low",
511
+ "dimensions": [
512
+ { "name": "<tool>", "score": 80, "findings": "...", "recommendations": "..." }
513
+ ],
514
+ "decisions": [
515
+ { "title": "...", "classification": "locked|free|deferred", "source_tools": ["gemini","qwen"], "rationale": "..." }
516
+ ],
517
+ "timestamp": "<ISO>"
518
+ }
519
+
520
+ 8. Write collab-report.md, context.md, conclusions.json to {sessionFolder}/
521
+ ```
522
+
523
+ Merge results into master `tasks.csv`, then delete `wave-3.csv`.
524
+
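The evidence-weighted voting in step 4 of the synthesis instruction could look roughly like the sketch below. The source does not say how the rules are ordered, so this assumes confidence is compared first and the number of `file:line` references breaks ties. The `resolveConflict` and `countFileLineRefs` helpers, the regex, and the `{ tool, confidence, evidence }` position shape are all hypothetical.

```javascript
// Hypothetical pattern for references such as "src/auth.ts:42"
// (a path ending in an extension, followed by a line number).
const FILE_LINE_REF = /[\w./-]+\.\w+:\d+/g;

function countFileLineRefs(evidence) {
  return (evidence.match(FILE_LINE_REF) || []).length;
}

// positions: [{ tool, confidence, evidence }] for one [CONFLICT] item.
function resolveConflict(positions) {
  if (positions.length === 0) throw new Error("a conflict needs at least one position");

  // Assumed ordering: higher confidence first, then more file:line references.
  const ranked = positions
    .map((p) => ({ ...p, refs: countFileLineRefs(p.evidence) }))
    .sort((a, b) => b.confidence - a.confidence || b.refs - a.refs);

  const best = ranked[0];
  const top = ranked.filter((p) => p.confidence === best.confidence && p.refs === best.refs);

  if (top.length > 1) {
    // Tie: present every top position instead of picking a winner.
    return { status: "tied", tag: "[SUGGESTED]", tools: top.map((p) => p.tool) };
  }
  return { status: "resolved", winner: best.tool };
}
```

Counting references is only a crude proxy for evidence specificity. An agent following the instruction would likely weigh the quality of the evidence as well, which a numeric comparison cannot capture.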
525
+ ### Phase 3: Results Aggregation
526
+
527
+ 1. Export final `tasks.csv` as `results.csv`
528
+ 2. Verify `collab-report.md` + `context.md` + `conclusions.json` exist (if synthesis failed, build minimal versions from available findings)
529
+ 3. Copy final outputs to `scratchDir`:
530
+ - `collab-report.md` → `{scratchDir}/collab-report.md`
531
+ - `context.md` → `{scratchDir}/context.md`
532
+ - `conclusions.json` → `{scratchDir}/conclusions.json`
533
+
534
+ 4. **Register artifact in state.json**:
535
+ ```json
536
+ {
537
+ "id": "CLB-{next_id}",
538
+ "type": "collab",
539
+ "milestone": "{current_milestone}",
540
+ "phase": null,
541
+ "scope": "adhoc",
542
+ "path": "scratch/{YYYYMMDD}-collab-{slug}",
543
+ "status": "completed",
544
+ "depends_on": null,
545
+ "harvested": false,
546
+ "created_at": "<ISO>",
547
+ "completed_at": "<ISO>"
548
+ }
549
+ ```
550
+
551
+ 5. **Spec Enrichment**: For each Locked decision in context.md:
552
+ - `maestro spec add arch "<decision.title>" "<decision.rationale>" --keywords ... --source collab:{sessionId}`
553
+
554
+ 6. Display summary:
555
+
556
+ ```
557
+ ============================================================
558
+ MULTI-CLI COLLABORATION COMPLETE
559
+ ============================================================
560
+ Requirement: {requirement}
561
+ Tools: {tool_list}
562
+ Consensus Level: {N}%
563
+ Wave Results: {completed}/{total} tasks
564
+
565
+ Per-Tool:
566
+ gemini: {status} (confidence: {N}%)
567
+ qwen: {status} (confidence: {N}%)
568
+ claude: {status} (confidence: {N}%)
569
+
570
+ Artifact: CLB-{id} registered in state.json
571
+ Output: {scratchDir}/
572
+
573
+ Next steps:
574
+ $maestro-analyze "{topic}" -- Deep feasibility analysis
575
+ $maestro-plan "{phase} --dir {scratchDir}" -- Plan from collab conclusions
576
+ $maestro-brainstorm "{topic}" -- Expand with multi-role brainstorm
577
+ ============================================================
578
+ ```
579
+
580
+ ### Shared Discovery Board Protocol
581
+
582
+ #### Domain Discovery Types
583
+
584
+ | Type | Dedup Key | Data Schema | Description |
585
+ |------|-----------|-------------|-------------|
586
+ | `cli_finding` | `data.tool+data.dimension` | `{tool, dimension, finding, confidence, evidence}` | Per-tool finding |
587
+ | `consensus` | `data.area` | `{area, tools[], finding, confidence}` | Cross-tool agreement |
588
+ | `conflict` | `data.area` | `{area, positions[{tool, stance, evidence}], resolution}` | Cross-tool disagreement |
589
+ | `unique_insight` | `data.tool+data.finding` | `{tool, finding, significance, actionable}` | Single-tool unique finding |
590
+
591
+ #### Protocol
592
+
593
+ Read `discoveries.ndjson` before analysis. The file is append-only: skip any entry whose type + dedup key already exists, and never modify or delete existing lines.
594
+
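A minimal sketch of the append-only dedup rule, using the dedup keys from the table above. It works on the NDJSON text in memory so the logic stays visible. The `{ type, data }` record envelope and the `dedupKey` and `appendDiscovery` helpers are assumptions for illustration, not part of the package.

```javascript
// Dedup key per discovery type, taken from the Domain Discovery Types table.
const DEDUP_KEYS = {
  cli_finding: (d) => `${d.tool}+${d.dimension}`,
  consensus: (d) => d.area,
  conflict: (d) => d.area,
  unique_insight: (d) => `${d.tool}+${d.finding}`,
};

function dedupKey(record) {
  if (!Object.hasOwn(DEDUP_KEYS, record.type)) {
    throw new Error(`Unknown discovery type: ${record.type}`);
  }
  return `${record.type}:${DEDUP_KEYS[record.type](record.data)}`;
}

// Returns the NDJSON text with `record` appended, or unchanged if it is a duplicate.
// Existing lines are never rewritten or removed.
function appendDiscovery(ndjson, record) {
  const seen = new Set(
    ndjson
      .split("\n")
      .filter((line) => line.trim() !== "")
      .map((line) => dedupKey(JSON.parse(line)))
  );
  if (seen.has(dedupKey(record))) return ndjson;
  return ndjson + JSON.stringify(record) + "\n";
}
```

In practice the new line would be appended to the file on disk rather than rebuilding the whole text. Concurrent writers from parallel delegates would need extra care that this sketch does not address.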
595
+ </execution>
596
+
597
+ <error_codes>
598
+
599
+ | Code | Severity | Description | Recovery |
600
+ |------|----------|-------------|----------|
601
+ | E001 | error | Requirement argument missing | Prompt for requirement |
602
+ | E002 | error | Fewer than 2 CLI tools available | Check cli-tools.json, enable more tools |
603
+ | E003 | error | Specified tool not found/enabled | Show available tools |
604
+ | E004 | error | All wave 1 delegates failed | Abort with per-tool error details |
605
+ | W001 | warning | One tool failed in wave 1 | Continue with remaining tools |
606
+ | W002 | warning | Cross-verify found >50% conflicts | Highlight in report, recommend manual review |
607
+ | W003 | warning | Synthesis agent failed | Use cross-verify output as fallback report |
608
+ | W004 | warning | Low consensus level (<40%) | Flag in summary, tools may need different prompts |
609
+
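The wave-related codes in the table can be read as simple threshold checks. The sketch below is one possible interpretation: the `evaluateRun` helper and its input fields are hypothetical, and W001 is taken to mean "at least one, but not all, tools failed in wave 1".

```javascript
// Hypothetical mapping from run statistics to the E004/W001/W002/W004 codes above.
function evaluateRun({ totalTools, failedTools, totalFindings, conflictCount, consensusLevel }) {
  // E004: every wave 1 delegate failed, so there is nothing to cross-verify.
  if (totalTools > 0 && failedTools >= totalTools) {
    return { abort: true, codes: ["E004"] };
  }

  const codes = [];
  if (failedTools > 0) codes.push("W001"); // partial failure, continue with the rest
  if (totalFindings > 0 && conflictCount / totalFindings > 0.5) codes.push("W002"); // >50% conflicts
  if (consensusLevel < 40) codes.push("W004"); // low consensus
  return { abort: false, codes };
}
```

Both thresholds are strict comparisons, so exactly 50% conflicts or a consensus level of exactly 40 raises no warning under this reading.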
610
+ </error_codes>
611
+
612
+ <success_criteria>
613
+ - [ ] Session folder created with valid tasks.csv
614
+ - [ ] Available CLI tools discovered from cli-tools.json with eligibility filtering
615
+ - [ ] Collaboration plan presented via request_user_input (tool list, pipeline, prompt preview)
616
+ - [ ] User approved or modified tool selection before execution
617
+ - [ ] CLI tools finalized (auto or user-modified) with minimum 2
618
+ - [ ] All wave 1 delegates executed via delegate-protocol.codex.md (blocking poll)
619
+ - [ ] Per-tool outputs written to per-tool/{tool}-output.md
620
+ - [ ] Cross-verification completed with consensus/conflict/unique classification
621
+ - [ ] Synthesis produced collab-report.md with merged findings
622
+ - [ ] context.md produced in standard Locked/Free/Deferred format (downstream compatible)
623
+ - [ ] conclusions.json produced with per-tool dimensions and decision trail (plan fast-track compatible)
624
+ - [ ] Consensus level computed and displayed
625
+ - [ ] results.csv exported with all task statuses
626
+ - [ ] CLB artifact registered in state.json
627
+ - [ ] Final outputs copied to scratchDir (collab-report.md, context.md, conclusions.json)
628
+ - [ ] Spec enrichment applied for Locked decisions
629
+ - [ ] discoveries.ndjson append-only throughout
630
+ - [ ] Partial degradation: continue if 1+ tools succeed in wave 1
631
+ </success_criteria>
@@ -153,6 +153,13 @@ Load session state by explicit ID or most recent `MCP-*/state.json` with `status
153
153
  4. Group into waves: barrier nodes → solo wave, non-barrier nodes → accumulate into parallel wave
154
154
  5. Build steps array from waves, write `state.json`
155
155
 
156
+ **Step 2.5a — Register goal constraint**:
157
+ ```
158
+ functions.create_goal({
159
+ objective: `Player ${template_name}: ${steps.length} steps from template ${template_id}`
160
+ })
161
+ ```
162
+
156
163
  **Step 2.6** — Display start banner:
157
164
  ```
158
165
  ============================================================
@@ -272,6 +279,9 @@ const RESULT_SCHEMA = {
272
279
  ```
273
280
 
274
281
  Update `state.status = "completed"`, write final `state.json`.
282
+ Release goal constraint: `functions.update_goal({ status: "complete" })`
283
+
284
+ **Note**: Abort path (Phase 3 step 3g) does NOT call `update_goal` — goal stays running for `-c` resume.
275
285
  </execution>
276
286
 
277
287
  <csv_schema>
@@ -279,9 +279,15 @@ Write `.workflow/.maestro/ralph-{YYYYMMDD-HHmmss}/status.json`:
279
279
  }
280
280
  ```
281
281
 
282
- ### 1.7: Initialize plan + confirm
282
+ ### 1.7: Initialize tracking + confirm
283
283
 
284
284
  ```
285
+ // Goal = outer constraint — ensures entire lifecycle chain completes
286
+ functions.create_goal({
287
+ objective: `Ralph lifecycle: ${lifecycle_position} → milestone-complete | ${steps.length} steps (${decision_count} decisions) | quality=${quality_mode}`
288
+ })
289
+
290
+ // Plan = inner tracking — sub-step progress
285
291
  functions.update_plan({
286
292
  explanation: "Ralph lifecycle: {position} → milestone-complete",
287
293
  plan: steps.map(step => ({ step: stepLabel(step), status: "pending" }))
@@ -587,8 +593,13 @@ functions.update_plan({
587
593
  explanation: "Ralph lifecycle complete",
588
594
  plan: steps.map(step => ({ step: stepLabel(step), status: "completed" }))
589
595
  })
596
+
597
+ // Release goal constraint — only on true completion
598
+ functions.update_goal({ status: "complete" })
590
599
  ```
591
600
 
601
+ **Note**: Pause/escalate paths (`post-debug-escalate` STOP, session pause) do NOT call `update_goal` — goal stays running for resume.
602
+
592
603
  Display:
593
604
  ```
594
605
  ============================================================
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "maestro-flow",
3
- "version": "0.3.40",
3
+ "version": "0.3.41",
4
4
  "description": "Workflow orchestration CLI with MCP endpoint support and extensible architecture",
5
5
  "type": "module",
6
6
  "imports": {