maestro-flow 0.3.40 → 0.3.42

@@ -0,0 +1,333 @@
1
+ ---
2
+ name: maestro-collab
3
+ description: Multi-CLI collaborative analysis -- fan-out to multiple CLI tools, cross-verify, synthesize
4
+ argument-hint: "\"<requirement>\" [--tools gemini,qwen,claude] [--mode analysis|write] [--rule <template>] [-y]"
5
+ allowed-tools:
6
+ - Read
7
+ - Write
8
+ - Bash
9
+ - Glob
10
+ - Grep
11
+ - Agent
12
+ - AskUserQuestion
13
+ ---
14
+
15
+ <purpose>
16
+ Multi-CLI collaboration: fan out the same requirement to multiple CLI tools in parallel, cross-verify outputs for consensus/conflicts, then synthesize into a unified report with standard downstream artifacts (context.md + conclusions.json).
17
+
18
+ Each CLI tool independently analyzes the requirement. Results are compared and merged via evidence-weighted synthesis.
19
+ </purpose>
20
+
21
+ <context>
22
+ $ARGUMENTS — requirement text and optional flags.
23
+
24
+ ```bash
25
+ /maestro-collab "analyze the auth module for security vulnerabilities"
26
+ /maestro-collab "design a caching strategy" --tools gemini,qwen,claude
27
+ /maestro-collab -y "review error handling patterns"
28
+ /maestro-collab "refactor user service" --mode write --tools gemini,claude
29
+ ```
30
+
31
+ **Flags**:
32
+ - `--tools <list>`: Comma-separated CLI tools (default: auto-select first 3 enabled)
33
+ - `--mode analysis|write`: Delegate mode (default: analysis)
34
+ - `--rule <template>`: Shared rule template for all delegates
35
+ - `-y` / `--yes`: Skip plan confirmation
36
+
37
+ **Output**: `.workflow/scratch/{YYYYMMDD}-collab-{slug}/`
38
+ - `collab-report.md` — full collaboration report
39
+ - `context.md` — standard Locked/Free/Deferred decisions (plan/analyze compatible)
40
+ - `conclusions.json` — structured conclusions (plan fast-track compatible)
41
+ - `per-tool/{tool}-output.md` — raw per-tool outputs
42
+ </context>
43
+
44
+ <execution>
45
+
46
+ ### Step 1: Parse Arguments
47
+
48
+ Extract from `$ARGUMENTS`:
49
+ - `requirement` — remaining text after flag removal (error if empty)
50
+ - `--tools` → `selectedTools` (comma-split)
51
+ - `--mode` → `delegateMode` (default: `analysis`)
52
+ - `--rule` → `ruleTemplate`
53
+ - `-y` / `--yes` → `autoYes`
54
+
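A minimal sketch of the Step 1 parsing, assuming shell-style tokenization of `$ARGUMENTS` (the skill does not say how the string is split). `parse_collab_args` is a hypothetical helper name, and rejecting unknown `--mode` values is an assumption rather than something the skill states.

```python
import shlex


def parse_collab_args(arguments: str) -> dict:
    """Split $ARGUMENTS into the requirement text and the Step 1 flags."""
    tokens = shlex.split(arguments)
    parsed = {
        "requirement": "",
        "selectedTools": None,       # None means "auto-select in Step 2"
        "delegateMode": "analysis",  # default from the Flags list
        "ruleTemplate": None,
        "autoYes": False,
    }
    value_flags = {
        "--tools": "selectedTools",
        "--mode": "delegateMode",
        "--rule": "ruleTemplate",
    }
    remaining = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in value_flags:
            if i + 1 >= len(tokens):
                raise ValueError(f"{tok} requires a value")
            parsed[value_flags[tok]] = tokens[i + 1]
            i += 2
        elif tok in ("-y", "--yes"):
            parsed["autoYes"] = True
            i += 1
        else:
            remaining.append(tok)
            i += 1

    if parsed["selectedTools"] is not None:
        # Comma-split, dropping empty items such as a trailing comma.
        parsed["selectedTools"] = [
            t.strip() for t in parsed["selectedTools"].split(",") if t.strip()
        ]
    if parsed["delegateMode"] not in ("analysis", "write"):
        # Assumption: anything outside analysis|write is treated as an error.
        raise ValueError("--mode must be analysis or write")

    # Requirement is whatever text remains after flag removal.
    parsed["requirement"] = " ".join(remaining).strip()
    if not parsed["requirement"]:
        raise ValueError("E001: requirement argument missing")
    return parsed
```

For example, `parse_collab_args('-y "review error handling patterns"')` sets `autoYes` and leaves the quoted text as the requirement.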
55
+ ### Step 2: Discover Available CLI Tools
56
+
57
+ ```bash
58
+ Bash("maestro tools list --json 2>/dev/null || cat ~/.maestro/cli-tools.json")
59
+ ```
60
+
61
+ Parse tool entries. Build eligible list:
62
+ - `enabled == true`
63
+ - If `--mode write`: exclude `type == "api-endpoint"`
64
+
65
+ Auto-select (when `--tools` omitted): first 3 eligible in config order.
66
+ Validate: minimum 2 eligible tools (abort if fewer).
67
+
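The eligibility rules above could be sketched as follows. This assumes each tool entry is a dict with `name`, `enabled`, and `type` keys; the actual layout of `cli-tools.json` is not shown in this diff, and `select_tools` is a hypothetical name.

```python
def select_tools(entries, mode="analysis", requested=None, auto_limit=3):
    """Filter tool entries by the Step 2 rules and pick the tools to run.

    entries: tool dicts in config order (assumed keys: name, enabled, type).
    """
    eligible = [
        e["name"]
        for e in entries
        if e.get("enabled") is True
        # In write mode, api-endpoint tools are excluded.
        and not (mode == "write" and e.get("type") == "api-endpoint")
    ]
    if len(eligible) < 2:
        raise ValueError("E002: fewer than 2 CLI tools eligible")

    if not requested:
        # Auto-select: first N eligible tools in config order.
        return eligible[:auto_limit]

    missing = [name for name in requested if name not in eligible]
    if missing:
        raise ValueError(f"E003: tool not found/enabled: {', '.join(missing)}")
    if len(requested) < 2:
        # Mirrors the ">= 2" validation applied when the user edits the selection.
        raise ValueError("at least 2 tools must be selected")
    return list(requested)
```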
68
+ ### Step 3: Present Collaboration Plan
69
+
70
+ **(Skip if `-y`)**
71
+
72
+ Display plan, then ask user:
73
+
74
+ ```
75
+ ============================================================
76
+ COLLABORATION PLAN
77
+ ============================================================
78
+ Requirement: {requirement}
79
+ Mode: {delegateMode}
80
+ Rule: {ruleTemplate || "none"}
81
+
82
+ Available CLI Tools (from cli-tools.json):
83
+ [✓] gemini — gemini-3.1-pro-preview [fullstack, frontend]
84
+ [✓] claude — claude-sonnet-4-6 [fullstack]
85
+ [✓] codex — gpt-5.5 [fullstack, backend]
86
+ [ ] opencode — (no model) [fullstack]
87
+
88
+ Selected: gemini, claude, codex (3 tools)
89
+
90
+ Pipeline:
91
+ 1. Fan-out → parallel delegate to each tool
92
+ 2. Cross-verification → consensus/conflict analysis
93
+ 3. Synthesis → context.md + conclusions.json
94
+ ============================================================
95
+ ```
96
+
97
+ Use `AskUserQuestion` with options:
98
+ - **Execute** — proceed with selected tools
99
+ - **Modify tool selection** — let user specify different tool combination
100
+ - **Cancel** — abort
101
+
102
+ If **Modify tool selection**: ask user which tools to use (show eligible list), validate ≥ 2, re-display plan.
103
+
104
+ ### Step 4: Setup Session
105
+
106
+ ```
107
+ slug = requirement kebab-cased, max 40 chars
108
+ outputDir = .workflow/scratch/{YYYYMMDD}-collab-{slug}/
109
+ ```
110
+
111
+ Create `outputDir` + `outputDir/per-tool/`.
112
+
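One possible reading of the slug and directory setup, as a sketch. The exact kebab-casing rule is an assumption (lowercase, runs of non-alphanumeric characters collapsed to a single hyphen), and `make_slug` / `setup_session` are hypothetical names.

```python
import re
from datetime import date
from pathlib import Path


def make_slug(requirement: str, max_len: int = 40) -> str:
    """Kebab-case the requirement and cap it at max_len characters."""
    slug = re.sub(r"[^a-z0-9]+", "-", requirement.lower()).strip("-")
    # Truncate, then drop a hyphen left dangling by the cut.
    return slug[:max_len].rstrip("-")


def setup_session(requirement: str, root: str = ".", today: date = None) -> Path:
    """Create outputDir and outputDir/per-tool/, returning outputDir."""
    today = today or date.today()
    output_dir = (
        Path(root) / ".workflow" / "scratch"
        / f"{today:%Y%m%d}-collab-{make_slug(requirement)}"
    )
    (output_dir / "per-tool").mkdir(parents=True, exist_ok=True)
    return output_dir
```

Note that under this rule a requirement with no ASCII letters or digits produces an empty slug, so a real implementation would likely need a fallback.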
113
+ ### Step 5: Build Delegate Prompt
114
+
115
+ Shared prompt for all tools:
116
+
117
+ ```
118
+ PURPOSE: {requirement}; success = actionable findings with evidence
119
+ TASK: {auto-decomposed into 3-5 specific verbs}
120
+ MODE: {delegateMode}
121
+ CONTEXT: @**/*
122
+ EXPECTED: Structured findings with file:line references, confidence score (0-100), prioritized recommendations. Sections: ## Findings, ## Recommendations, ## Confidence
123
+ CONSTRAINTS: {extracted from requirement}
124
+ ```
125
+
126
+ ### Step 6: Parallel Fan-Out
127
+
128
+ Launch ALL delegate calls simultaneously using multiple `Bash(run_in_background: true)` in a **single message**:
129
+
130
+ ```
131
+ // Launch all in ONE message — do NOT wait between calls
132
+ Bash({
133
+ command: `maestro delegate "${prompt}" --to gemini --mode ${mode} ${rule}`,
134
+ run_in_background: true
135
+ })
136
+ Bash({
137
+ command: `maestro delegate "${prompt}" --to claude --mode ${mode} ${rule}`,
138
+ run_in_background: true
139
+ })
140
+ Bash({
141
+ command: `maestro delegate "${prompt}" --to codex --mode ${mode} ${rule}`,
142
+ run_in_background: true
143
+ })
144
+ ```
145
+
146
+ **After launching all calls → STOP immediately. Do not output anything. Wait for background completion callbacks.**
147
+
148
+ ### Step 7: Collect Results
149
+
150
+ As each background callback arrives:
151
+ 1. Extract exec ID from output (`[MAESTRO_EXEC_ID=...]`)
152
+ 2. Run `maestro delegate output <id>` to get full result
153
+ 3. Write raw output to `per-tool/{tool}-output.md`
154
+
155
+ **Wait until ALL callbacks have arrived before proceeding.**
156
+
157
+ ### Step 8: Cross-Verify
158
+
159
+ Read all `per-tool/{tool}-output.md` files. Compare findings across tools:
160
+
161
+ For each finding, classify:
162
+ - **[CONSENSUS]**: 2+ tools agree on same finding/recommendation
163
+ - **[CONFLICT]**: Tools disagree on approach or assessment
164
+ - **[UNIQUE]**: Finding from only one tool
165
+
166
+ Compute `consensus_level = (consensus_count / total_findings) * 100`.
167
+
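A sketch of the classification and the `consensus_level` formula. It assumes findings have already been normalized so that the same issue reported by different tools shares a `key`, and that each carries a comparable `stance`; how that matching is done is not specified by the skill. Treating any disagreement on a key as `CONFLICT`, even when two of three tools agree, is also an assumption.

```python
from collections import defaultdict


def cross_verify(findings):
    """Classify findings and compute consensus_level.

    findings: list of dicts with assumed keys tool, key, stance.
    Returns (labels, consensus_level) where labels maps key -> label.
    """
    stances_by_key = defaultdict(dict)
    for f in findings:
        stances_by_key[f["key"]][f["tool"]] = f["stance"]

    labels = {}
    for key, stances in stances_by_key.items():
        if len(stances) == 1:
            labels[key] = "UNIQUE"      # reported by only one tool
        elif len(set(stances.values())) == 1:
            labels[key] = "CONSENSUS"   # 2+ tools, same stance
        else:
            labels[key] = "CONFLICT"    # tools disagree

    total = len(labels)
    consensus_count = sum(1 for label in labels.values() if label == "CONSENSUS")
    # consensus_level = (consensus_count / total_findings) * 100
    consensus_level = (consensus_count / total) * 100 if total else 0.0
    return labels, consensus_level
```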
168
+ ### Step 9: Synthesize Outputs
169
+
170
+ Resolve conflicts via evidence-weighted voting:
171
+ - Higher confidence tool's position wins
172
+ - More specific evidence (file:line refs) wins over general statements
173
+ - If tied: mark as `[SUGGESTED]`
174
+
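The voting rules above do not say which criterion takes precedence. The sketch below assumes confidence is compared first, with the number of `file:line` references as a tie-breaker, and marks the conflict `SUGGESTED` only when both are equal. The `RESOLVED` status, the regex for references, and the name `resolve_conflict` are all hypothetical.

```python
import re

# Assumed shape of a file:line reference, e.g. src/auth/login.ts:42
FILE_LINE_REF = re.compile(r"[\w./-]+\.\w+:\d+")


def resolve_conflict(positions):
    """Pick a winning position for one conflict.

    positions: list of dicts with assumed keys tool, confidence (0-100), text.
    """
    if not positions:
        raise ValueError("no positions to resolve")

    def weight(p):
        # Confidence first, then count of file:line references in the text.
        return (p["confidence"], len(FILE_LINE_REF.findall(p["text"])))

    ranked = sorted(positions, key=weight, reverse=True)
    if len(ranked) > 1 and weight(ranked[0]) == weight(ranked[1]):
        # Tied on both criteria: no winner, leave as a suggestion.
        return {"status": "SUGGESTED", "winner": None}
    return {"status": "RESOLVED", "winner": ranked[0]["tool"]}
```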
175
+ Generate three output files:
176
+
177
+ #### collab-report.md
178
+
179
+ ```markdown
180
+ # Multi-CLI Collaboration Report — {requirement}
181
+
182
+ ## Summary
183
+ - Tools: {tool_list}
184
+ - Consensus level: {N}%
185
+ - Key finding: {top finding}
186
+
187
+ ## Consensus Findings
188
+ {findings agreed by 2+ tools}
189
+
190
+ ## Resolved Conflicts
191
+ {conflicts resolved with rationale and winning tool}
192
+
193
+ ## Unresolved Items
194
+ {items requiring human judgment}
195
+
196
+ ## Unique Insights
197
+ {valuable unique findings with source tool attribution}
198
+
199
+ ## Recommendations
200
+ {prioritized, merged recommendations}
201
+
202
+ ## Per-Tool Confidence
203
+ | Tool | Confidence | Key Strength |
204
+ |------|-----------|--------------|
205
+ ```
206
+
207
+ #### context.md (standard downstream format)
208
+
209
+ ```markdown
210
+ # Context: {requirement}
211
+
212
+ **Date**: {date}
213
+ **Mode**: collab ({tool_list})
214
+ **Consensus Level**: {N}%
215
+
216
+ ## Decisions
217
+
218
+ ### Decision N: {TITLE}
219
+ - **Context**: {what and why}
220
+ - **Options**: 1. {opt1} 2. {opt2}
221
+ - **Chosen**: {selected}
222
+ - **Reason**: {rationale — which tools agreed/disagreed}
223
+
224
+ ## Constraints
225
+
226
+ ### Locked
227
+ {[CONSENSUS] items — treat as confirmed decisions}
228
+
229
+ ### Free
230
+ {[UNIQUE] items with strong evidence — implementer discretion}
231
+
232
+ ### Deferred
233
+ {[UNRESOLVED] conflicts — require human judgment}
234
+
235
+ ## Code Context
236
+ {file:line references from per-tool findings}
237
+ ```
238
+
239
+ #### conclusions.json
240
+
241
+ ```json
242
+ {
243
+ "session_id": "{sessionId}",
244
+ "subject": "{requirement}",
245
+ "mode": "collab",
246
+ "tools": ["gemini", "claude", "codex"],
247
+ "consensus_level": 85,
248
+ "recommendation": "Go|No-Go|Conditional",
249
+ "confidence": "high|medium|low",
250
+ "dimensions": [
251
+ { "name": "gemini", "score": 80, "findings": "...", "recommendations": "..." }
252
+ ],
253
+ "decisions": [
254
+ { "title": "...", "classification": "locked|free|deferred", "source_tools": [], "rationale": "..." }
255
+ ],
256
+ "timestamp": "<ISO>"
257
+ }
258
+ ```
259
+
260
+ ### Step 10: Register Artifact
261
+
262
+ Append to `.workflow/state.json`:
263
+
264
+ ```json
265
+ {
266
+ "id": "CLB-{next_id}",
267
+ "type": "collab",
268
+ "milestone": "{current_milestone}",
269
+ "phase": null,
270
+ "scope": "adhoc",
271
+ "path": "scratch/{YYYYMMDD}-collab-{slug}",
272
+ "status": "completed",
273
+ "depends_on": null,
274
+ "harvested": false,
275
+ "created_at": "<ISO>",
276
+ "completed_at": "<ISO>"
277
+ }
278
+ ```
279
+
280
+ ### Step 11: Display Summary
281
+
282
+ ```
283
+ ============================================================
284
+ MULTI-CLI COLLABORATION COMPLETE
285
+ ============================================================
286
+ Requirement: {requirement}
287
+ Tools: {tool_list}
288
+ Consensus Level: {N}%
289
+
290
+ Per-Tool:
291
+ gemini: completed (confidence: {N}%)
292
+ claude: completed (confidence: {N}%)
293
+ codex: completed (confidence: {N}%)
294
+
295
+ Artifact: CLB-{id}
296
+ Output: {outputDir}/
297
+
298
+ Next steps:
299
+ /maestro-analyze "{topic}" — Deep feasibility analysis
300
+ /maestro-plan "{phase} --dir {dir}" — Plan from collab conclusions
301
+ /maestro-brainstorm "{topic}" — Expand with multi-role brainstorm
302
+ ============================================================
303
+ ```
304
+
305
+ </execution>
306
+
307
+ <error_codes>
308
+
309
+ | Code | Severity | Condition | Recovery |
310
+ |------|----------|-----------|----------|
311
+ | E001 | error | Requirement argument missing | Prompt for requirement |
312
+ | E002 | error | Fewer than 2 CLI tools eligible | Check cli-tools.json, enable more tools |
313
+ | E003 | error | Specified tool not found/enabled | Show available tools |
314
+ | E004 | error | All delegates failed | Abort with per-tool error details |
315
+ | W001 | warning | One tool failed | Continue with remaining tools |
316
+ | W002 | warning | >50% conflicts in cross-verify | Highlight in report, recommend manual review |
317
+ | W003 | warning | Low consensus level (<40%) | Flag in summary |
318
+
319
+ </error_codes>
320
+
321
+ <success_criteria>
322
+ - [ ] Available tools discovered from cli-tools.json with eligibility filtering
323
+ - [ ] Plan presented via AskUserQuestion with tool modification option (unless -y)
324
+ - [ ] All delegates launched in parallel via Bash(run_in_background: true)
325
+ - [ ] Execution stopped after launch — waited for all callbacks
326
+ - [ ] Per-tool outputs written to per-tool/{tool}-output.md
327
+ - [ ] Cross-verification: consensus/conflict/unique classification complete
328
+ - [ ] collab-report.md produced with merged findings
329
+ - [ ] context.md produced in Locked/Free/Deferred format (downstream compatible)
330
+ - [ ] conclusions.json produced (plan fast-track compatible)
331
+ - [ ] CLB artifact registered in state.json
332
+ - [ ] Partial degradation: continued if 1+ tools succeeded
333
+ </success_criteria>
@@ -164,6 +164,10 @@ Never "simulate" or "inline" a skill's work. If Skill() is not called, the step
164
164
  | quality-test | `-y --auto-fix` |
165
165
  | quality-retrospective | `-y` |
166
166
  | maestro-milestone-complete | `-y` |
167
+ | maestro-verify | `-y` |
168
+ | quality-review | `-y` |
169
+ | quality-debug | `-y` |
170
+ | maestro-milestone-audit | `-y` |
167
171
 
168
172
  ```
169
173
  flag = auto_flag_map[next.skill] || ""
@@ -182,9 +186,15 @@ Context-isolated skill execution via new Claude Code session.
182
186
  HARD RULE: external nodes ALWAYS delegate to `claude` — only Claude Code can execute slash-command skills.
183
187
  `session.cli_tool` is for analysis-mode delegates (e.g., decision evaluation in ralph), NOT for external node execution.
184
188
 
189
+ HARD RULE: Delegate sessions are non-interactive and cannot confirm prompts. External nodes MUST always append `-y`, regardless of whether the user passed `-y`. Without it, the delegate hangs indefinitely waiting for confirmation.
190
+
185
191
  ```
192
+ // Always apply auto flag — delegate sessions are non-interactive and cannot confirm
193
+ flag = auto_flag_map[next.skill] || "-y"
194
+ effective_args = `${next.args} ${flag}`
195
+
186
196
  Bash({
187
- command: `maestro delegate "Execute: Skill({ skill: \"${next.skill}\", args: \"${next.args}\" })
197
+ command: `maestro delegate "Execute: Skill({ skill: \"${next.skill}\", args: \"${effective_args}\" })
188
198
  Do NOT reimplement — invoke the skill command directly." --to claude --mode write`,
189
199
  run_in_background: true,
190
200
  timeout: 600000
@@ -190,6 +190,8 @@ Generate steps from `lifecycle_position` to target (default: `milestone-complete
190
190
 
191
191
  IMPORTANT: `external` ≠ single CLI tool call. It spawns a full Claude Code session that executes the skill command — the delegate session has complete skill access.
192
192
 
193
+ HARD RULE: Delegate sessions are non-interactive. All skills executed via delegate MUST always append `-y`, regardless of user flags. This is enforced by ralph-execute in Step 5c — ralph does not preset flags in steps.
194
+
193
195
  **Build rules:**
194
196
  1. Start from inferred position, skip completed stages
195
197
  2. After each decision-triggering stage, insert a decision node with `{ decision, retry_count: 0, max_retries: 2 }`
@@ -317,6 +319,8 @@ For quality-gate decisions (post-verify, post-business-test, post-review, post-t
317
319
  **Structural decisions → Step 3.5 (direct evaluation)**
318
320
  **Quality-gate decisions → delegate below:**
319
321
 
322
+ NOTE: This delegate uses `--mode analysis` (read-only) — the CLI tool won't trigger interactive prompts, so no extra `-y` needed. If this ever changes to write-mode delegate, ensure the target skill appends `-y`.
323
+
320
324
  **Result file mapping** (for delegate CONTEXT):
321
325
 
322
326
  | Decision type | Files to include |
@@ -139,9 +139,15 @@ After each barrier skill completes, read its artifacts and update `state.context
139
139
  }
140
140
  ```
141
141
 
142
- 7. **Initialize plan tracking** (dual-track: status.json + update_plan):
142
+ 7. **Initialize tracking** (goal constraint + plan sub-items):
143
143
 
144
144
  ```
145
+ // Goal = outer constraint — ensures entire chain completes
146
+ functions.create_goal({
147
+ objective: `Maestro ${chain_name}: ${steps.length} steps [${steps.map(s => s.skill).join(' → ')}]`
148
+ })
149
+
150
+ // Plan = inner tracking — sub-step progress
145
151
  functions.update_plan({
146
152
  plan: steps.map((step, i) => ({
147
153
  id: `step-${i}`,
@@ -233,9 +239,12 @@ Object with all fields required: `status` ("completed"|"failed"), `skill_call` (
233
239
 
234
240
  ### Phase 3: Completion Report
235
241
 
236
- Finalize dual tracking:
242
+ Finalize tracking:
237
243
  - status.json: `state.status = 'completed'`
238
244
  - update_plan: all steps → `"completed"` (skipped steps also marked completed)
245
+ - **update_goal**: `functions.update_goal({ status: "complete" })` — release goal constraint
246
+
247
+ **Note**: Abort path (Phase 2 step 7) does NOT call `update_goal` — goal stays running for `--continue` resume.
239
248
 
240
249
  ```
241
250
  === COORDINATE COMPLETE ===