maestro-flow 0.3.39 → 0.3.41

@@ -0,0 +1,631 @@
1
+ ---
2
+ name: maestro-collab
3
+ description: Multi-CLI collaborative analysis -- fan-out to multiple CLI tools, cross-verify, synthesize
4
+ argument-hint: "\"<requirement>\" [--tools gemini,qwen,claude] [--mode analysis|write] [--rule <template>] [-y]"
5
+ allowed-tools: spawn_agents_on_csv, Read, Write, Edit, Bash, Glob, Grep, request_user_input
6
+ ---
7
+
8
+ <purpose>
9
+ Wave-based multi-CLI collaboration using `spawn_agents_on_csv`. Diamond topology: parallel CLI fan-out (Wave 1), cross-verification (Wave 2), then unified synthesis (Wave 3).
10
+
11
+ Each CLI tool independently analyzes the same requirement from its own perspective. Results are cross-verified for conflicts, then synthesized into a single actionable output.
12
+
13
+ **Core workflow**: Parse Requirement -> CLI Fan-Out -> Cross-Verify -> Synthesize
14
+
15
+ ```
16
+ +---------------------------------------------------------------------------+
17
+ | COLLAB CSV WAVE WORKFLOW |
18
+ +---------------------------------------------------------------------------+
19
+ | |
20
+ | Phase 1: Requirement Resolution -> CSV |
21
+ | +-- Parse requirement and flags from arguments |
22
+ | +-- Select CLI tools (explicit --tools or auto-select) |
23
+ | +-- Load project context (project.md, specs, codebase) |
24
+ | +-- Generate tasks.csv with fan-out + verify + synthesis rows |
25
+ | +-- User validates tool selection (skip if -y) |
26
+ | |
27
+ | Phase 2: Wave Execution Engine |
28
+ | +-- Wave 1: CLI Fan-Out (parallel, 2-5 agents) |
29
+ | | +-- Each agent delegates to one CLI tool via exec_command |
30
+ | | +-- Same requirement, different CLI perspective |
31
+ | | +-- Results: per-tool findings + recommendations |
32
+ | +-- Wave 2: Cross-Verification (single agent) |
33
+ | | +-- Compare all CLI outputs for consensus/conflicts |
34
+ | | +-- Tag: [CONSENSUS] / [CONFLICT] / [UNIQUE] |
35
+ | | +-- Results: conflict matrix + agreement areas |
36
+ | +-- Wave 3: Synthesis (single agent) |
37
+ | | +-- Merge verified findings into actionable output |
38
+ | | +-- Resolve conflicts with evidence-weighted voting |
39
+ | | +-- Generate final collab-report.md |
40
+ | +-- discoveries.ndjson shared across all waves (append-only) |
41
+ | |
42
+ | Phase 3: Results Aggregation |
43
+ | +-- Export results.csv + collab-report.md |
44
+ | +-- Display summary with consensus level + next steps |
45
+ | |
46
+ +---------------------------------------------------------------------------+
47
+ ```
48
+
49
+ </purpose>
50
+
51
+ <context>
52
+ ```bash
53
+ $maestro-collab "analyze the auth module for security vulnerabilities"
54
+ $maestro-collab "design a caching strategy for the API layer" --tools gemini,qwen,claude
55
+ $maestro-collab -y "review error handling patterns across the codebase"
56
+ $maestro-collab "refactor user service to use repository pattern" --mode write --tools gemini,claude
57
+ ```
58
+
59
+ **Flags**:
60
+ - `--tools <list>`: Comma-separated CLI tools (default: auto-select the first 3 enabled tools from cli-tools.json, in config order)
61
+ - `--mode analysis|write`: Delegate mode (default: analysis)
62
+ - `--rule <template>`: Shared rule template for all delegates
63
+ - `-y, --yes`: Skip all confirmations (auto mode)
64
+ - `-c, --concurrency N`: Max concurrent agents within each wave (default: 5)
65
+
66
+ **Auto-select logic** (when `--tools` omitted):
67
+ 1. Read `~/.maestro/cli-tools.json`
68
+ 2. Filter `enabled == true`
69
+ 3. Exclude `api-endpoint` type tools when `--mode write`
70
+ 4. Take the first 3 remaining tools in config order
71
+
72
+ **Output Directory**: `.workflow/.csv-wave/{session-id}/`
73
+ **Core Output**: `tasks.csv` + `results.csv` + `discoveries.ndjson` + `collab-report.md`
74
+ </context>
75
+
76
+ <csv_schema>
77
+
78
+ ### tasks.csv (Master State)
79
+
80
+ ```csv
81
+ id,title,description,tool,role,prompt,mode,rule,deps,context_from,wave,status,findings,recommendations,confidence,error
82
+ "1","CLI: gemini","Analyze requirement via gemini CLI","gemini","analyze","<full prompt>","analysis","","","","1","","","","",""
83
+ "2","CLI: qwen","Analyze requirement via qwen CLI","qwen","analyze","<full prompt>","analysis","","","","1","","","","",""
84
+ "3","CLI: claude","Analyze requirement via claude CLI","claude","analyze","<full prompt>","analysis","","","","1","","","","",""
85
+ "4","Cross-Verify","Compare all CLI outputs: tag consensus, conflicts, unique findings","","","","","","1;2;3","1;2;3","2","","","","",""
86
+ "5","Synthesis","Merge verified findings into actionable collab-report.md","","","","","","4","4","3","","","","",""
87
+ ```
88
+
89
+ **Columns**:
90
+
91
+ | Column | Phase | Description |
92
+ |--------|-------|-------------|
93
+ | `id` | Input | Unique task identifier |
94
+ | `title` | Input | Short task title |
95
+ | `description` | Input | Detailed instructions for this task |
96
+ | `tool` | Input | CLI tool name (wave 1 only) |
97
+ | `role` | Input | Delegate --role value |
98
+ | `prompt` | Input | Full 6-field prompt for delegate |
99
+ | `mode` | Input | analysis or write |
100
+ | `rule` | Input | --rule template name (optional) |
101
+ | `deps` | Input | Semicolon-separated dependency task IDs |
102
+ | `context_from` | Input | Semicolon-separated task IDs for prev_context |
103
+ | `wave` | Computed | Wave number (1=fan-out, 2=verify, 3=synthesis) |
104
+ | `status` | Output | pending -> completed / failed |
105
+ | `findings` | Output | Key findings summary (max 500 chars) |
106
+ | `recommendations` | Output | Per-tool recommendations |
107
+ | `confidence` | Output | Self-assessed confidence (0-100) |
108
+ | `error` | Output | Error message if failed |
109
+
110
+ ### Session Structure
111
+
112
+ ```
113
+ .workflow/.csv-wave/{YYYYMMDD}-collab-{slug}/
114
+ +-- tasks.csv
115
+ +-- results.csv
116
+ +-- discoveries.ndjson
117
+ +-- collab-report.md
118
+ +-- context.md ← standard Locked/Free/Deferred format (downstream compatible)
119
+ +-- conclusions.json ← structured conclusions (plan fast-track compatible)
120
+ +-- wave-{N}.csv (temporary)
121
+ +-- per-tool/
122
+ +-- gemini-output.md
123
+ +-- qwen-output.md
124
+ +-- claude-output.md
125
+ ```
126
+
127
+ ### Downstream Compatibility
128
+
129
+ | Consumer | Consumption Path | Artifact |
130
+ |----------|-----------------|----------|
131
+ | **maestro-plan** | `$maestro-plan "N --dir .workflow/scratch/{collab-session}/"` | `context.md` + `conclusions.json` |
132
+ | **maestro-analyze** | auto via `state.json.artifacts[]` (type=collab) | `context.md` as prior context |
133
+ | **maestro-brainstorm** | auto via `state.json.artifacts[]` (type=collab) | `context.md` as supplementary context |
134
+ | **maestro-ralph** | auto — lifecycle position inference includes collab | artifact chain lookup |
135
+
136
+ `context.md` uses the standard Locked/Free/Deferred decision format. `conclusions.json` follows the same schema as maestro-analyze's output. This allows plan to skip wave 1 exploration when collab has already produced structured conclusions.
137
+ </csv_schema>
138
+
139
+ <invariants>
140
+ 1. **Plan Before Execute**: Present collaboration plan with tool selection for user approval before any CLI invocation
141
+ 2. **Wave Order is Sacred**: Never execute wave 2 before wave 1 completes
142
+ 3. **CSV is Source of Truth**: Master tasks.csv holds all state
143
+ 4. **Context Propagation**: prev_context built from master CSV, not from memory
144
+ 5. **Discovery Board is Append-Only**: Never modify or delete discoveries.ndjson
145
+ 6. **Same Prompt, Different Tool**: All wave 1 agents use the same base prompt; only the `--to` target differs
146
+ 7. **Minimum 2 Tools**: Collaboration requires at least 2 CLI tools; abort if fewer enabled
147
+ 8. **Delegate Protocol**: All exec_command calls follow delegate-protocol.codex.md (yield_time + poll)
148
+ 9. **DO NOT STOP**: Continuous execution until all waves complete
149
+ 10. **Partial Degradation**: If one or more tools fail in wave 1 but at least one succeeds, continue with the available results (W001); abort only when all wave 1 delegates fail (E004)
150
+ </invariants>
151
+
152
+ <execution>
153
+
154
+ ### Session Initialization
155
+
156
+ **Parse from `$ARGUMENTS`**:
157
+
158
+ | Variable | Source | Default |
159
+ |----------|--------|---------|
160
+ | `AUTO_YES` | `--yes` or `-y` | false |
161
+ | `maxConcurrency` | `--concurrency N` or `-c N` | 5 |
162
+ | `selectedTools` | `--tools <list>` | auto-select |
163
+ | `delegateMode` | `--mode` | `analysis` |
164
+ | `ruleTemplate` | `--rule` | null |
165
+ | `requirement` | remaining text after flag removal | "" (E001 if empty) |
166
+
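The table above can be read as a small parsing routine. The following is a minimal sketch, assuming `$ARGUMENTS` arrives as a single string and that double quotes are the only quoting style in use; `tokenize` and `parseCollabArgs` are hypothetical helper names, not part of maestro-flow.

```javascript
// Hypothetical helper: split an argument string, keeping "quoted text" together.
function tokenize(input) {
  const tokens = [];
  const re = /"([^"]*)"|(\S+)/g;
  let match;
  while ((match = re.exec(input)) !== null) {
    tokens.push(match[1] !== undefined ? match[1] : match[2]);
  }
  return tokens;
}

// Hypothetical helper: map tokens onto the variables and defaults from the table.
function parseCollabArgs(input) {
  const opts = {
    AUTO_YES: false,
    maxConcurrency: 5,
    selectedTools: null, // null means "auto-select"
    delegateMode: "analysis",
    ruleTemplate: null,
    requirement: "",
  };
  const rest = [];
  const tokens = tokenize(input);
  for (let i = 0; i < tokens.length; i++) {
    const token = tokens[i];
    if (token === "-y" || token === "--yes") {
      opts.AUTO_YES = true;
    } else if (token === "-c" || token === "--concurrency") {
      opts.maxConcurrency = Number(tokens[++i]);
    } else if (token === "--tools") {
      opts.selectedTools = (tokens[++i] ?? "").split(",").filter(Boolean);
    } else if (token === "--mode") {
      opts.delegateMode = tokens[++i];
    } else if (token === "--rule") {
      opts.ruleTemplate = tokens[++i];
    } else {
      rest.push(token); // anything that is not a flag belongs to the requirement
    }
  }
  opts.requirement = rest.join(" ").trim();
  if (opts.requirement === "") {
    throw new Error("E001: requirement argument missing");
  }
  return opts;
}
```

For example, `parseCollabArgs('-y "review error handling" --tools gemini,claude')` would yield `AUTO_YES: true`, two selected tools, and the quoted text as the requirement.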
167
+ **Auto-bootstrap**: If `.workflow/` missing, create minimal structure.
168
+
169
+ **Session paths** (UTC+8 date prefix):
170
+ - `slug` ← requirement kebab-cased, max 40 chars
171
+ - `sessionFolder`: `.workflow/.csv-wave/{YYYYMMDD}-collab-{slug}/`
172
+
173
+ - `scratchDir`: `.workflow/scratch/{YYYYMMDD}-collab-{slug}/`
174
+
175
+ Create `sessionFolder` + `sessionFolder/per-tool/` + `scratchDir`.
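The session path rules above can be sketched as follows. This is an illustration only: it assumes an ASCII requirement (non-ASCII characters are simply dropped from the slug here) and the helper names `toSlug`, `datePrefixUtc8`, and `buildSessionPaths` are hypothetical.

```javascript
// Kebab-case the requirement and cap it at 40 characters.
function toSlug(requirement, maxLen = 40) {
  return requirement
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // runs of other characters become one hyphen
    .replace(/^-+|-+$/g, "")     // trim leading/trailing hyphens
    .slice(0, maxLen)
    .replace(/-+$/, "");         // avoid a trailing hyphen after truncation
}

// YYYYMMDD for the given instant, evaluated in UTC+8.
function datePrefixUtc8(now = new Date()) {
  const shifted = new Date(now.getTime() + 8 * 60 * 60 * 1000);
  return shifted.toISOString().slice(0, 10).replace(/-/g, "");
}

function buildSessionPaths(requirement, now = new Date()) {
  const name = `${datePrefixUtc8(now)}-collab-${toSlug(requirement)}`;
  return {
    sessionFolder: `.workflow/.csv-wave/${name}/`,
    scratchDir: `.workflow/scratch/${name}/`,
  };
}
```

Passing `now` explicitly keeps the function deterministic, which makes the date shift easy to check around midnight UTC.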
176
+
177
+ ### Phase 1: Requirement Resolution -> CSV
178
+
179
+ **Objective**: Parse requirement, discover available tools, present plan for user approval, generate tasks.csv.
180
+
181
+ **1. Discover available CLI tools**:
182
+
183
+ Read `~/.maestro/cli-tools.json` → extract all tool entries. Build `availableTools[]`:
184
+
185
+ ```
186
+ For each tool in config.tools:
187
+ availableTools.push({
188
+ name: tool.name,
189
+ enabled: tool.enabled,
190
+ type: tool.type, // builtin | cli-wrapper | api-endpoint
191
+ model: tool.primaryModel,
192
+ tags: tool.tags, // [fullstack, frontend, backend, ...]
193
+ eligible: tool.enabled
194
+ && (delegateMode != "write" || tool.type != "api-endpoint")
195
+ })
196
+ ```
197
+
198
+ Validate: at least 2 eligible tools required (E002 if fewer).
199
+
200
+ **2. Auto-recommend tool selection**:
201
+
202
+ | Source | Logic |
203
+ |--------|-------|
204
+ | `--tools` explicit | Use provided list, validate each is eligible |
205
+ | No `--tools` | Take first 3 eligible tools in config order |
206
+
207
+ Mark each eligible tool as `recommended: true/false` based on auto-selection.
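Steps 1 and 2 can be sketched as two small functions. This is a minimal illustration, assuming `cli-tools.json` has already been parsed into an object whose `tools` field is an array; `discoverTools` and `selectTools` are hypothetical names, and the error message for an explicit list with fewer than 2 tools is an assumption.

```javascript
// Build availableTools[] with the eligibility rule from step 1.
function discoverTools(config, delegateMode) {
  return config.tools.map((tool) => ({
    name: tool.name,
    enabled: tool.enabled,
    type: tool.type,
    model: tool.primaryModel,
    tags: tool.tags ?? [],
    // api-endpoint tools are excluded only in write mode
    eligible:
      Boolean(tool.enabled) &&
      (delegateMode !== "write" || tool.type !== "api-endpoint"),
  }));
}

// Apply the selection table from step 2 and mark the chosen tools.
function selectTools(availableTools, explicitNames) {
  const eligible = availableTools.filter((t) => t.eligible);
  if (eligible.length < 2) {
    throw new Error("E002: fewer than 2 CLI tools available");
  }
  let selected;
  if (explicitNames && explicitNames.length > 0) {
    const eligibleNames = new Set(eligible.map((t) => t.name));
    const unknown = explicitNames.filter((n) => !eligibleNames.has(n));
    if (unknown.length > 0) {
      throw new Error(`E003: tool not found/enabled: ${unknown.join(", ")}`);
    }
    selected = explicitNames;
  } else {
    selected = eligible.slice(0, 3).map((t) => t.name); // config order
  }
  if (selected.length < 2) {
    throw new Error("collaboration requires at least 2 tools");
  }
  const tools = availableTools.map((t) => ({
    ...t,
    recommended: selected.includes(t.name),
  }));
  return { selected, tools };
}
```

Filtering for eligibility before taking the first 3 means a write-mode run still gets 3 tools when an `api-endpoint` entry sits early in the config.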
208
+
209
+ **3. Context loading**:
210
+ - Read `.workflow/project.md` if exists
211
+ - Load project specs: `maestro spec load --category arch,coding` (if available)
212
+ - Grep for relevant codebase files based on requirement keywords
213
+
214
+ **4. Build delegate prompt** (shared across all tools):
215
+
216
+ ```
217
+ PURPOSE: {requirement}; success = actionable findings with evidence
218
+ TASK: {auto-decomposed from requirement into 3-5 specific verbs}
219
+ MODE: {delegateMode}
220
+ CONTEXT: @**/* | Memory: {project context if available}
221
+ EXPECTED: Structured findings with file:line references, confidence score (0-100), prioritized recommendations
222
+ CONSTRAINTS: {from requirement} | Output findings as structured text with sections: ## Findings, ## Recommendations, ## Confidence
223
+ ```
224
+
225
+ **5. Present Collaboration Plan** (skip if AUTO_YES):
226
+
227
+ Display plan summary, then `request_user_input` for approval:
228
+
229
+ ```
230
+ ============================================================
231
+ COLLABORATION PLAN
232
+ ============================================================
233
+ Requirement: {requirement}
234
+ Mode: {delegateMode}
235
+ Rule: {ruleTemplate || "none"}
236
+
237
+ Available CLI Tools (from cli-tools.json):
238
+ [✓] gemini — gemini-3.1-pro-preview [fullstack, frontend]
239
+ [✓] claude — claude-sonnet-4-6 [fullstack]
240
+ [✓] codex — gpt-5.5 [fullstack, backend]
241
+ [ ] opencode — (no model) [fullstack]
242
+
243
+ Selected: gemini, claude, codex (3 tools)
244
+
245
+ Pipeline:
246
+ Wave 1: Fan-out → gemini + claude + codex (parallel)
247
+ Wave 2: Cross-verification (conflicts/consensus)
248
+ Wave 3: Synthesis → context.md + conclusions.json
249
+
250
+ Prompt Preview:
251
+ PURPOSE: {first 80 chars}...
252
+ TASK: {task verbs}
253
+ ============================================================
254
+ ```
255
+
256
+ ```json
257
+ request_user_input({
258
+ "questions": [{
259
+ "id": "collab_plan",
260
+ "header": "Collaboration Plan",
261
+ "question": "This is the collaboration plan. How would you like to proceed?",
262
+ "options": [
263
+ {
264
+ "label": "Execute (Recommended)",
265
+ "description": "Start collaborative analysis with the {N} selected CLI tools"
266
+ },
267
+ {
268
+ "label": "Modify tool selection",
269
+ "description": "Change the set of CLI tools taking part in the collaboration"
270
+ },
271
+ {
272
+ "label": "Cancel",
273
+ "description": "Abort the collaboration without making any calls"
274
+ }
275
+ ]
276
+ }]
277
+ })
278
+ ```
279
+
280
+ **Handle user response**:
281
+
282
+ | Response | Action |
283
+ |----------|--------|
284
+ | **Execute** | Proceed to step 6 (CSV generation) |
285
+ | **Modify tool selection** | → Tool Modification Interaction (step 5a) |
286
+ | **Cancel** | Abort with message "Collaboration cancelled" |
287
+
288
+ #### 5a. Tool Modification Interaction
289
+
290
+ Present all eligible tools as toggleable options:
291
+
292
+ ```json
293
+ request_user_input({
293
+ "questions": [{
294
+ "id": "tool_selection",
295
+ "header": "CLI Tool Selection",
296
+ "question": "Select the CLI tools to take part in the collaboration (at least 2):",
298
+ "options": [
299
+ { "label": "gemini", "description": "gemini-3.1-pro-preview — fullstack, frontend" },
300
+ { "label": "claude", "description": "claude-sonnet-4-6 — fullstack" },
301
+ { "label": "codex", "description": "gpt-5.5 — fullstack, backend" },
302
+ { "label": "opencode", "description": "(no model) — fullstack" }
303
+ ]
304
+ }]
305
+ })
306
+ ```
307
+
308
+ Options are **dynamically built** from `availableTools.filter(t => t.eligible)`:
309
+ - `label` = tool name
310
+ - `description` = `{primaryModel} — {tags.join(", ")}`
311
+
312
+ Parse user selection → update `selectedTools`. Validate minimum 2 (re-prompt if fewer).
313
+ Return to step 5 to re-display updated plan.
314
+
315
+ **6. CSV generation**:
316
+ - N tool rows (wave 1, one per selected tool)
317
+ - 1 cross-verify row (wave 2, deps on all wave 1)
318
+ - 1 synthesis row (wave 3, deps on wave 2)
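The row layout above can be sketched as follows, using the column order from the `tasks.csv` schema. This is an illustration, not the package's generator: `buildTaskRows` and `toCsv` are hypothetical helpers, and `status` is left empty to mirror the sample rows in the schema section.

```javascript
const COLUMNS = [
  "id", "title", "description", "tool", "role", "prompt", "mode", "rule",
  "deps", "context_from", "wave", "status", "findings", "recommendations",
  "confidence", "error",
];

// N fan-out rows (wave 1) + 1 cross-verify row (wave 2) + 1 synthesis row (wave 3).
function buildTaskRows(tools, prompt, mode = "analysis", rule = "") {
  const blank = Object.fromEntries(COLUMNS.map((c) => [c, ""]));
  const fanOut = tools.map((tool, i) => ({
    ...blank,
    id: String(i + 1),
    title: `CLI: ${tool}`,
    description: `Analyze requirement via ${tool} CLI`,
    tool, role: "analyze", prompt, mode, rule, wave: "1",
  }));
  const fanOutIds = fanOut.map((r) => r.id).join(";");
  const verifyId = String(tools.length + 1);
  const verify = {
    ...blank,
    id: verifyId,
    title: "Cross-Verify",
    description: "Compare all CLI outputs: tag consensus, conflicts, unique findings",
    deps: fanOutIds, context_from: fanOutIds, wave: "2",
  };
  const synthesis = {
    ...blank,
    id: String(tools.length + 2),
    title: "Synthesis",
    description: "Merge verified findings into actionable collab-report.md",
    deps: verifyId, context_from: verifyId, wave: "3",
  };
  return [...fanOut, verify, synthesis];
}

// Quote every field and double any embedded quotes.
function toCsv(rows) {
  const quote = (v) => `"${String(v).replace(/"/g, '""')}"`;
  const lines = rows.map((r) => COLUMNS.map((c) => quote(r[c])).join(","));
  return [COLUMNS.join(","), ...lines].join("\n");
}
```

Quoting every field keeps prompts that contain commas or newlines inside a single CSV cell.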
319
+
320
+ ### Phase 2: Wave Execution Engine
321
+
322
+ #### Wave 1: CLI Fan-Out (Parallel)
323
+
324
+ Filter `wave == 1 && status == pending` from master CSV. Write `wave-1.csv`.
325
+
326
+ Each wave 1 agent:
327
+
328
+ 1. Read task row: extract `tool`, `prompt`, `mode`, `rule`
329
+ 2. Execute delegate (blocking):
330
+
331
+ ```
332
+ exec_command({
333
+ cmd: `maestro delegate "${prompt}" --to ${tool} --mode ${mode} ${rule ? '--rule ' + rule : ''}`,
334
+ yield_time_ms: 30000,
335
+ max_output_tokens: 6000
336
+ })
337
+ // If session_id returned -> poll write_stdin until completion
338
+ // See @~/.maestro/workflows/delegate-protocol.codex.md
339
+ ```
340
+
341
+ 3. Parse delegate output
342
+ 4. Write per-tool output to `per-tool/{tool}-output.md`
343
+ 5. Share findings via discovery board
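Step 2 interpolates `${prompt}` between double quotes, which can break when the prompt itself contains quotes, `$`, or backticks. The following is a minimal sketch of one way to quote the arguments, assuming `exec_command` hands `cmd` to a POSIX shell (the source does not state which shell is used); `shellQuote` and `buildDelegateCommand` are hypothetical helpers.

```javascript
// POSIX single-quote escaping: close the quote, emit an escaped quote, reopen.
function shellQuote(value) {
  return `'${String(value).replace(/'/g, `'\\''`)}'`;
}

// Assemble the delegate command using only the flags named in the source.
function buildDelegateCommand({ prompt, tool, mode, rule }) {
  const parts = [
    "maestro", "delegate", shellQuote(prompt),
    "--to", shellQuote(tool),
    "--mode", shellQuote(mode),
  ];
  if (rule) parts.push("--rule", shellQuote(rule));
  return parts.join(" ");
}
```

Inside single quotes the shell performs no expansion, so a prompt such as `say "hi" to $USER` reaches the delegate unchanged.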
344
+
345
+ ```javascript
346
+ spawn_agents_on_csv({
347
+ csv_path: `${sessionFolder}/wave-1.csv`,
348
+ id_column: "id",
349
+ instruction: buildFanOutInstruction(sessionFolder),
350
+ max_concurrency: maxConcurrency,
351
+ max_runtime_seconds: 3600,
352
+ output_csv_path: `${sessionFolder}/wave-1-results.csv`,
353
+ output_schema: { id, status: ["completed"|"failed"], findings, recommendations, confidence, error }
354
+ })
355
+ ```
356
+
357
+ Merge results into master `tasks.csv`, delete `wave-1.csv`.
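The merge step, together with the wave 1 degradation rules (W001, E004), might look like this once both CSV files have been parsed into arrays of row objects. This is a sketch under that assumption; `mergeWaveResults` and `checkWave1Degradation` are hypothetical names.

```javascript
const OUTPUT_FIELDS = ["status", "findings", "recommendations", "confidence", "error"];

// Copy output columns from wave results into the matching master rows (by id).
function mergeWaveResults(masterRows, resultRows) {
  const byId = new Map(resultRows.map((r) => [String(r.id), r]));
  return masterRows.map((row) => {
    const result = byId.get(String(row.id));
    if (!result) return row; // rows from other waves stay untouched
    const merged = { ...row };
    for (const field of OUTPUT_FIELDS) {
      if (result[field] !== undefined) merged[field] = result[field];
    }
    return merged;
  });
}

// E004 when every wave 1 delegate failed; W001 warnings otherwise.
function checkWave1Degradation(masterRows) {
  const wave1 = masterRows.filter((r) => String(r.wave) === "1");
  const failed = wave1.filter((r) => r.status === "failed");
  if (wave1.length > 0 && failed.length === wave1.length) {
    throw new Error("E004: all wave 1 delegates failed");
  }
  return failed.map((r) => `W001: ${r.tool} failed in wave 1`);
}
```

Only the output columns are overwritten, so the input columns in the master CSV remain the single source of truth.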
358
+
359
+ **Fan-Out Agent Instruction**:
360
+
361
+ ```
362
+ You are a CLI collaboration agent. Your task is to delegate analysis to a specific CLI tool and capture its output.
363
+
364
+ 1. Read your task row for: tool, prompt, mode, rule
365
+ 2. Execute the delegate call using exec_command (follow delegate-protocol.codex.md):
366
+ exec_command({
367
+ cmd: `maestro delegate "<prompt>" --to <tool> --mode <mode> [--rule <rule>]`,
368
+ yield_time_ms: 30000, max_output_tokens: 6000
369
+ })
370
+ 3. If session_id returned, poll via write_stdin until completion
371
+ 4. Write full output to {sessionFolder}/per-tool/{tool}-output.md
372
+ 5. Extract: findings (key points), recommendations (actionable items), confidence (0-100)
373
+ 6. Share via discoveries.ndjson: type="cli_finding", data={tool, dimension, finding, confidence}
374
+ 7. Report result with findings, recommendations, confidence
375
+ ```
376
+
377
+ #### Wave 2: Cross-Verification (Single Agent)
378
+
379
+ Filter `wave == 2 && status == pending` from master CSV. Build `prev_context` from wave 1 findings and write `wave-2.csv`.
380
+
381
+ ```javascript
382
+ spawn_agents_on_csv({
383
+ csv_path: `${sessionFolder}/wave-2.csv`,
384
+ id_column: "id",
385
+ instruction: buildCrossVerifyInstruction(sessionFolder),
386
+ max_concurrency: 1,
387
+ max_runtime_seconds: 3600,
388
+ output_csv_path: `${sessionFolder}/wave-2-results.csv`,
389
+ output_schema: { id, status: ["completed"|"failed"], findings, recommendations, confidence, error }
390
+ })
391
+ ```
392
+
393
+ **Cross-Verify Agent Instruction**:
394
+
395
+ ```
396
+ You are a cross-verification agent. Compare outputs from multiple CLI tools.
397
+
398
+ 1. Read all per-tool outputs from {sessionFolder}/per-tool/
399
+ 2. Read discoveries.ndjson for shared findings
400
+ 3. For each finding across tools, classify:
401
+ - [CONSENSUS]: 2+ tools agree on same finding/recommendation
402
+ - [CONFLICT]: Tools disagree on approach/assessment
403
+ - [UNIQUE]: Finding from only one tool (may be valuable or noise)
404
+ 4. For [CONFLICT] items: note each tool's position and evidence strength
405
+ 5. Compute consensus_level: (consensus_count / total_findings) * 100
406
+ 6. Write findings as structured text:
407
+ ## Consensus Areas
408
+ ## Conflicts (with per-tool positions)
409
+ ## Unique Findings (with source tool)
410
+ ## Consensus Level: {N}%
411
+ ```
412
+
413
+ Merge results into master `tasks.csv`, delete `wave-2.csv`.
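The classification and `consensus_level` steps of the cross-verify instruction can be sketched in code. Several things here are assumptions: findings are taken to be already normalized to `{tool, area, stance}` records, `total_findings` is counted per distinct area, and the function names are hypothetical. In practice the agent performs this comparison on free text.

```javascript
// Tag each area as CONSENSUS, CONFLICT, or UNIQUE.
function classifyFindings(findings) {
  const byArea = new Map();
  for (const f of findings) {
    if (!byArea.has(f.area)) byArea.set(f.area, []);
    byArea.get(f.area).push(f);
  }
  const classified = [];
  for (const [area, items] of byArea) {
    const tools = new Set(items.map((i) => i.tool));
    const stances = new Set(items.map((i) => i.stance));
    let tag;
    if (tools.size < 2) tag = "UNIQUE";          // only one tool reported it
    else if (stances.size > 1) tag = "CONFLICT"; // tools disagree
    else tag = "CONSENSUS";                      // 2+ tools agree
    classified.push({ area, tag, tools: [...tools] });
  }
  return classified;
}

// (consensus_count / total_findings) * 100, rounded to a whole percent.
function consensusLevel(classified) {
  if (classified.length === 0) return 0;
  const consensus = classified.filter((c) => c.tag === "CONSENSUS").length;
  return Math.round((consensus / classified.length) * 100);
}

// W002: more than 50% conflicts. W004: consensus level below 40%.
function consensusWarnings(classified) {
  const warnings = [];
  if (classified.length === 0) return warnings;
  const conflicts = classified.filter((c) => c.tag === "CONFLICT").length;
  if (conflicts / classified.length > 0.5) warnings.push("W002");
  if (consensusLevel(classified) < 40) warnings.push("W004");
  return warnings;
}
```

With two consensus areas, one conflict, and one unique finding, this yields a consensus level of 50% and no warnings.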
414
+
415
+ #### Wave 3: Synthesis (Single Agent)
416
+
417
+ Filter `wave == 3 && status == pending` from master CSV. Build `prev_context` from wave 2 findings and write `wave-3.csv`.
418
+
419
+ ```javascript
420
+ spawn_agents_on_csv({
421
+ csv_path: `${sessionFolder}/wave-3.csv`,
422
+ id_column: "id",
423
+ instruction: buildSynthesisInstruction(sessionFolder),
424
+ max_concurrency: 1,
425
+ max_runtime_seconds: 3600,
426
+ output_csv_path: `${sessionFolder}/wave-3-results.csv`,
427
+ output_schema: { id, status: ["completed"|"failed"], findings, recommendations, confidence, error }
428
+ })
429
+ ```
430
+
431
+ **Synthesis Agent Instruction**:
432
+
433
+ ```
434
+ You are a synthesis agent. Merge cross-verified findings into a final report.
435
+
436
+ 1. Read cross-verification results from prev_context
437
+ 2. Read all per-tool outputs from {sessionFolder}/per-tool/
438
+ 3. Read discoveries.ndjson
439
+ 4. Resolve [CONFLICT] items via evidence-weighted voting:
440
+ - Higher confidence tool's position wins
441
+ - More specific evidence (file:line refs) wins over general statements
442
+ - If tied: present both with [SUGGESTED] tag
443
+ 5. Generate collab-report.md:
444
+
445
+ # Multi-CLI Collaboration Report -- {requirement}
446
+
447
+ ## Summary
448
+ - Tools: {tool_list}
449
+ - Consensus level: {N}%
450
+ - Key finding: {top finding}
451
+
452
+ ## Consensus Findings
453
+ {merged findings agreed by 2+ tools}
454
+
455
+ ## Resolved Conflicts
456
+ {conflicts resolved with rationale}
457
+
458
+ ## Unresolved Items
459
+ {items requiring human judgment}
460
+
461
+ ## Unique Insights
462
+ {valuable unique findings with source attribution}
463
+
464
+ ## Recommendations
465
+ {prioritized, merged recommendations}
466
+
467
+ ## Per-Tool Confidence
468
+ | Tool | Confidence | Key Strength |
469
+ |------|-----------|--------------|
470
+
471
+ 6. Generate context.md (standard downstream format):
472
+
473
+ # Context: {requirement}
474
+
475
+ **Date**: {date}
476
+ **Mode**: collab ({tool_list})
477
+ **Consensus Level**: {N}%
478
+
479
+ ## Decisions
480
+
481
+ ### Decision N: {TITLE}
482
+ - **Context**: {what and why}
483
+ - **Options**: 1. {opt1} 2. {opt2}
484
+ - **Chosen**: {selected — from consensus or evidence-weighted resolution}
485
+ - **Reason**: {rationale — include which tools agreed/disagreed}
486
+
487
+ ## Constraints
488
+
489
+ ### Locked
490
+ {[CONSENSUS] items — agreed by 2+ tools, treat as confirmed decisions}
491
+
492
+ ### Free
493
+ {[UNIQUE] items with strong evidence — implementer may adopt or skip}
494
+
495
+ ### Deferred
496
+ {[UNRESOLVED] conflicts — require human judgment before proceeding}
497
+
498
+ ## Code Context
499
+ {file:line references from per-tool findings}
500
+
501
+ 7. Generate conclusions.json (plan fast-track compatible):
502
+
503
+ {
504
+ "session_id": "<session>",
505
+ "subject": "<requirement>",
506
+ "mode": "collab",
507
+ "tools": ["gemini", "qwen", "claude"],
508
+ "consensus_level": 85,
509
+ "recommendation": "Go|No-Go|Conditional",
510
+ "confidence": "high|medium|low",
511
+ "dimensions": [
512
+ { "name": "<tool>", "score": 80, "findings": "...", "recommendations": "..." }
513
+ ],
514
+ "decisions": [
515
+ { "title": "...", "classification": "locked|free|deferred", "source_tools": ["gemini","qwen"], "rationale": "..." }
516
+ ],
517
+ "timestamp": "<ISO>"
518
+ }
519
+
520
+ 8. Write collab-report.md, context.md, conclusions.json to {sessionFolder}/
521
+ ```
522
+
523
+ Merge results into master `tasks.csv`, delete `wave-3.csv`.
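The evidence-weighted voting rules in the synthesis instruction do not say how confidence and evidence specificity are combined. The sketch below assumes one possible reading: confidence is compared first, a `file:line` reference breaks confidence ties, and anything still tied is returned with the `[SUGGESTED]` tag. `resolveConflict` and the `positions` record shape are hypothetical.

```javascript
// Rough detector for file:line references such as src/cache.ts:42.
const FILE_LINE_REF = /\S+:\d+/;

// positions: [{ tool, stance, evidence, confidence }]
function resolveConflict(positions) {
  if (positions.length === 0) throw new Error("no positions to resolve");
  const ranked = positions
    .map((p) => ({ ...p, specific: FILE_LINE_REF.test(p.evidence ?? "") }))
    .sort(
      (a, b) =>
        b.confidence - a.confidence ||              // assumption: confidence first
        Number(b.specific) - Number(a.specific)     // then evidence specificity
    );
  const top = ranked[0];
  const tiedStances = [
    ...new Set(
      ranked
        .filter((p) => p.confidence === top.confidence && p.specific === top.specific)
        .map((p) => p.stance)
    ),
  ];
  if (tiedStances.length > 1) {
    // Still tied: present every option rather than pick one arbitrarily.
    return { status: "tied", tag: "[SUGGESTED]", options: tiedStances };
  }
  return { status: "resolved", chosen: top.stance, tool: top.tool };
}
```

A different weighting (for example, letting specific evidence outrank a small confidence gap) would be equally consistent with the source text.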
524
+
525
+ ### Phase 3: Results Aggregation
526
+
527
+ 1. Export final `tasks.csv` as `results.csv`
528
+ 2. Verify `collab-report.md` + `context.md` + `conclusions.json` exist (if synthesis failed, build minimal versions from available findings)
529
+ 3. Copy final outputs to `scratchDir`:
530
+ - `collab-report.md` → `{scratchDir}/collab-report.md`
531
+ - `context.md` → `{scratchDir}/context.md`
532
+ - `conclusions.json` → `{scratchDir}/conclusions.json`
533
+
534
+ 4. **Register artifact in state.json**:
535
+ ```json
536
+ {
537
+ "id": "CLB-{next_id}",
538
+ "type": "collab",
539
+ "milestone": "{current_milestone}",
540
+ "phase": null,
541
+ "scope": "adhoc",
542
+ "path": "scratch/{YYYYMMDD}-collab-{slug}",
543
+ "status": "completed",
544
+ "depends_on": null,
545
+ "harvested": false,
546
+ "created_at": "<ISO>",
547
+ "completed_at": "<ISO>"
548
+ }
549
+ ```
550
+
551
+ 5. **Spec Enrichment**: For each Locked decision in context.md:
552
+ - `maestro spec add arch "<decision.title>" "<decision.rationale>" --keywords ... --source collab:{sessionId}`
553
+
554
+ 6. Display summary:
555
+
556
+ ```
557
+ ============================================================
558
+ MULTI-CLI COLLABORATION COMPLETE
559
+ ============================================================
560
+ Requirement: {requirement}
561
+ Tools: {tool_list}
562
+ Consensus Level: {N}%
563
+ Wave Results: {completed}/{total} tasks
564
+
565
+ Per-Tool:
566
+ gemini: {status} (confidence: {N}%)
567
+ qwen: {status} (confidence: {N}%)
568
+ claude: {status} (confidence: {N}%)
569
+
570
+ Artifact: CLB-{id} registered in state.json
571
+ Output: {scratchDir}/
572
+
573
+ Next steps:
574
+ $maestro-analyze "{topic}" -- Deep feasibility analysis
575
+ $maestro-plan "{phase} --dir {scratchDir}" -- Plan from collab conclusions
576
+ $maestro-brainstorm "{topic}" -- Expand with multi-role brainstorm
577
+ ============================================================
578
+ ```
579
+
580
+ ### Shared Discovery Board Protocol
581
+
582
+ #### Domain Discovery Types
583
+
584
+ | Type | Dedup Key | Data Schema | Description |
585
+ |------|-----------|-------------|-------------|
586
+ | `cli_finding` | `data.tool+data.dimension` | `{tool, dimension, finding, confidence, evidence}` | Per-tool finding |
587
+ | `consensus` | `data.area` | `{area, tools[], finding, confidence}` | Cross-tool agreement |
588
+ | `conflict` | `data.area` | `{area, positions[{tool, stance, evidence}], resolution}` | Cross-tool disagreement |
589
+ | `unique_insight` | `data.tool+data.finding` | `{tool, finding, significance, actionable}` | Single-tool unique finding |
590
+
591
+ #### Protocol
592
+
593
+ Read `discoveries.ndjson` before analysis. Append-only: dedup by type+key, never modify/delete.
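The append-only rule with dedup by type and key can be sketched as a pure function over the file contents. This is an illustration only: `appendDiscovery` is a hypothetical helper, and a real implementation would read and append to `discoveries.ndjson` on disk.

```javascript
// Dedup keys taken from the Domain Discovery Types table.
const DEDUP_KEYS = {
  cli_finding: (d) => `${d.tool}+${d.dimension}`,
  consensus: (d) => d.area,
  conflict: (d) => d.area,
  unique_insight: (d) => `${d.tool}+${d.finding}`,
};

function discoveryKey(entry) {
  const keyFn = DEDUP_KEYS[entry.type];
  return keyFn ? `${entry.type}:${keyFn(entry.data)}` : null;
}

// Returns the new file contents; existing lines are never modified or removed.
function appendDiscovery(ndjson, entry) {
  const key = discoveryKey(entry);
  if (key === null) throw new Error(`unknown discovery type: ${entry.type}`);
  const seen = ndjson
    .split("\n")
    .filter(Boolean)
    .map((line) => discoveryKey(JSON.parse(line)));
  if (seen.includes(key)) return ndjson; // duplicate: skip the append
  const separator = ndjson === "" || ndjson.endsWith("\n") ? "" : "\n";
  return ndjson + separator + JSON.stringify(entry) + "\n";
}
```

Because the function only ever returns the original text or the original text plus one line, the append-only invariant holds by construction.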
594
+
595
+ </execution>
596
+
597
+ <error_codes>
598
+
599
+ | Code | Severity | Description | Recovery |
600
+ |------|----------|-------------|----------|
601
+ | E001 | error | Requirement argument missing | Prompt for requirement |
602
+ | E002 | error | Fewer than 2 CLI tools available | Check cli-tools.json, enable more tools |
603
+ | E003 | error | Specified tool not found/enabled | Show available tools |
604
+ | E004 | error | All wave 1 delegates failed | Abort with per-tool error details |
605
+ | W001 | warning | One or more (but not all) tools failed in wave 1 | Continue with remaining tools |
606
+ | W002 | warning | Cross-verify found >50% conflicts | Highlight in report, recommend manual review |
607
+ | W003 | warning | Synthesis agent failed | Use cross-verify output as fallback report |
608
+ | W004 | warning | Low consensus level (<40%) | Flag in summary, tools may need different prompts |
609
+
610
+ </error_codes>
611
+
612
+ <success_criteria>
613
+ - [ ] Session folder created with valid tasks.csv
614
+ - [ ] Available CLI tools discovered from cli-tools.json with eligibility filtering
615
+ - [ ] Collaboration plan presented via request_user_input (tool list, pipeline, prompt preview)
616
+ - [ ] User approved or modified tool selection before execution
617
+ - [ ] CLI tools finalized (auto or user-modified) with minimum 2
618
+ - [ ] All wave 1 delegates executed via delegate-protocol.codex.md (blocking poll)
619
+ - [ ] Per-tool outputs written to per-tool/{tool}-output.md
620
+ - [ ] Cross-verification completed with consensus/conflict/unique classification
621
+ - [ ] Synthesis produced collab-report.md with merged findings
622
+ - [ ] context.md produced in standard Locked/Free/Deferred format (downstream compatible)
623
+ - [ ] conclusions.json produced with per-tool dimensions and decision trail (plan fast-track compatible)
624
+ - [ ] Consensus level computed and displayed
625
+ - [ ] results.csv exported with all task statuses
626
+ - [ ] CLB artifact registered in state.json
627
+ - [ ] Final outputs copied to scratchDir (collab-report.md, context.md, conclusions.json)
628
+ - [ ] Spec enrichment applied for Locked decisions
629
+ - [ ] discoveries.ndjson append-only throughout
630
+ - [ ] Partial degradation: continue if 1+ tools succeed in wave 1
631
+ </success_criteria>
@@ -359,6 +359,12 @@ spawn_agents_on_csv({
359
359
  1. **Plan checking** (inline, not a separate wave):
360
360
  Read `plan.json` + all `.task/TASK-*.json`. Validate: requirements coverage, file feasibility, dependency correctness (no cycles, valid wave order), grep-verifiable convergence criteria, read_first completeness, action concreteness, no parallel file conflicts, **task count within complexity threshold** (reject over-split plans), **no per-file splitting** (each task must be feature-level).
361
361
 
362
+ 1b. **Plan confidence scoring**:
363
+
364
+ Dimensions (5): requirements_coverage, task_quality, dependency_correctness, estimation_accuracy, collision_safety. Factors (weights): completeness(.30), specificity(.25), structural_validity(.20), user_validation(.15), consistency(.10). Add `confidence` section to `plan.json`.
365
+
366
+ **Readiness gate**: Block if requirements_coverage < 40% or any task missing read_first/convergence.criteria.
367
+
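The source lists the five dimensions and the factor weights but not how they combine. The sketch below assumes one plausible reading: each dimension receives five factor scores (0-100), a dimension's score is the weighted sum of its factors, and the overall confidence is the mean of the dimension scores. `scorePlan` and `readinessGate` are hypothetical names, and the gate compares the `requirements_coverage` dimension score against 40.

```javascript
const WEIGHTS = {
  completeness: 0.30,
  specificity: 0.25,
  structural_validity: 0.20,
  user_validation: 0.15,
  consistency: 0.10,
};

// Weighted sum of factor scores (0-100) for one dimension.
function scoreDimension(factors) {
  return Object.entries(WEIGHTS).reduce(
    (sum, [name, weight]) => sum + weight * (factors[name] ?? 0),
    0
  );
}

// dimensions: { requirements_coverage: {completeness: 80, ...}, ... }
function scorePlan(dimensions) {
  const scores = Object.fromEntries(
    Object.entries(dimensions).map(([dim, f]) => [dim, Math.round(scoreDimension(f))])
  );
  const values = Object.values(scores);
  // Assumption: overall confidence is the plain mean of dimension scores.
  const overall = Math.round(values.reduce((a, b) => a + b, 0) / values.length);
  const weakest = Object.keys(scores).reduce((a, b) => (scores[b] < scores[a] ? b : a));
  return { overall, weakest, dimensions: scores };
}

// Block on low requirements coverage or tasks missing read_first / convergence.criteria.
function readinessGate(confidence, tasks) {
  const reasons = [];
  if (confidence.dimensions.requirements_coverage < 40) {
    reasons.push("requirements_coverage below 40%");
  }
  for (const task of tasks) {
    if (!task.read_first || task.read_first.length === 0) {
      reasons.push(`${task.id}: missing read_first`);
    }
    const criteria = task.convergence && task.convergence.criteria;
    if (!criteria || criteria.length === 0) {
      reasons.push(`${task.id}: missing convergence.criteria`);
    }
  }
  return { blocked: reasons.length > 0, reasons };
}
```

The returned `overall` and `weakest` values correspond to the `Confidence: {overall}% (weakest: {dim})` line in the summary banner.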
362
368
  2. **Revision loop** (max 3 rounds): If critical issues found, regenerate affected tasks.
363
369
 
364
370
  2b. **Spec Enrichment**: Persist cross-task reusable design decisions:
@@ -397,6 +403,7 @@ spawn_agents_on_csv({
397
403
  Tasks: {task_count} tasks in {wave_count} waves
398
404
  Check: {checker_status} (iteration {check_count}/{max_checks})
399
405
  Collision: {collision_status}
406
+ Confidence: {overall}% (weakest: {dim})
400
407
 
401
408
  Next steps:
402
409
  $maestro-execute "{phase}" -- Execute the plan
@@ -464,6 +471,9 @@ echo '{"ts":"<ISO>","worker":"{id}","type":"existing_pattern","data":{"name":"Re
464
471
  - [ ] plan.json produced in phase directory
465
472
  - [ ] .task/TASK-*.json files produced for all tasks
466
473
  - [ ] Plan passes quality checks (coverage, deps, criteria)
474
+ - [ ] Plan confidence scored with 5-dimension factor model
475
+ - [ ] Readiness gate checked before confirmation
476
+ - [ ] plan.json includes confidence section
467
477
  - [ ] Collision detection executed against same-milestone plans
468
478
  - [ ] PLN artifact registered in state.json
469
479
  - [ ] context.md produced with exploration findings + plan overview
@@ -153,6 +153,13 @@ Load session state by explicit ID or most recent `MCP-*/state.json` with `status
153
153
  4. Group into waves: barrier nodes → solo wave, non-barrier nodes → accumulate into parallel wave
154
154
  5. Build steps array from waves, write `state.json`
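The wave grouping in step 4 can be sketched as a single pass over the nodes. This assumes the nodes are already in execution order and carry a boolean `barrier` flag; both the flag name and `groupIntoWaves` are hypothetical.

```javascript
// Barrier nodes run alone; consecutive non-barrier nodes share a parallel wave.
function groupIntoWaves(nodes) {
  const waves = [];
  let parallel = [];
  for (const node of nodes) {
    if (node.barrier) {
      if (parallel.length > 0) {
        waves.push(parallel); // flush the pending parallel wave first
        parallel = [];
      }
      waves.push([node]);     // solo wave for the barrier
    } else {
      parallel.push(node);
    }
  }
  if (parallel.length > 0) waves.push(parallel);
  return waves;
}
```

For nodes `a, b, B (barrier), c` this produces three waves: `[a, b]`, `[B]`, `[c]`.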
155
155
 
156
+ **Step 2.5a — Register goal constraint**:
157
+ ```
158
+ functions.create_goal({
159
+ objective: `Player ${template_name}: ${steps.length} steps from template ${template_id}`
160
+ })
161
+ ```
162
+
156
163
  **Step 2.6** — Display start banner:
157
164
  ```
158
165
  ============================================================
@@ -272,6 +279,9 @@ const RESULT_SCHEMA = {
272
279
  ```
273
280
 
274
281
  Update `state.status = "completed"`, write final `state.json`.
282
+ Release goal constraint: `functions.update_goal({ status: "complete" })`
283
+
284
+ **Note**: Abort path (Phase 3 step 3g) does NOT call `update_goal` — goal stays running for `-c` resume.
275
285
  </execution>
276
286
 
277
287
  <csv_schema>