npm - maestro-flow - Versions diffs - 0.4.17 → 0.4.18 - Mend

maestro-flow 0.4.17 → 0.4.18

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (80)

package/.agents/skills/maestro/SKILL.md +1 -1
package/.agents/skills/maestro-analyze/SKILL.md +5 -0
package/.agents/skills/maestro-blueprint/SKILL.md +5 -0
package/.agents/skills/maestro-brainstorm/SKILL.md +5 -0
package/.agents/skills/maestro-next/SKILL.md +219 -0
package/.agy/skills/maestro/SKILL.md +1 -1
package/.agy/skills/maestro-analyze/SKILL.md +5 -0
package/.agy/skills/maestro-blueprint/SKILL.md +5 -0
package/.agy/skills/maestro-brainstorm/SKILL.md +5 -0
package/.agy/skills/maestro-next/SKILL.md +215 -0
package/.claude/commands/maestro-analyze.md +5 -0
package/.claude/commands/maestro-blueprint.md +5 -0
package/.claude/commands/maestro-brainstorm.md +5 -0
package/.claude/commands/maestro-next.md +217 -0
package/.claude/commands/maestro.md +1 -1
package/.codex/skills/learn-decompose/SKILL.md +34 -3
package/.codex/skills/learn-retro/SKILL.md +31 -1
package/.codex/skills/learn-second-opinion/SKILL.md +34 -4
package/.codex/skills/maestro-analyze/SKILL.md +44 -5
package/.codex/skills/maestro-blueprint/SKILL.md +5 -0
package/.codex/skills/maestro-brainstorm/SKILL.md +46 -0
package/.codex/skills/maestro-execute/SKILL.md +61 -5
package/.codex/skills/maestro-milestone-audit/SKILL.md +64 -13
package/.codex/skills/maestro-milestone-complete/SKILL.md +12 -0
package/.codex/skills/maestro-plan/SKILL.md +36 -1
package/.codex/skills/maestro-player/SKILL.md +25 -6
package/.codex/skills/maestro-ralph/SKILL.md +17 -10
package/.codex/skills/maestro-ralph-execute/SKILL.md +2 -1
package/.codex/skills/maestro-roadmap/SKILL.md +35 -4
package/.codex/skills/maestro-ui-codify/SKILL.md +38 -10
package/.codex/skills/maestro-verify/SKILL.md +40 -5
package/.codex/skills/manage-codebase-rebuild/SKILL.md +52 -5
package/.codex/skills/manage-issue-discover/SKILL.md +106 -15
package/.codex/skills/quality-auto-test/SKILL.md +70 -16
package/.codex/skills/quality-debug/SKILL.md +139 -28
package/.codex/skills/quality-refactor/SKILL.md +61 -11
package/.codex/skills/quality-review/SKILL.md +45 -9
package/.codex/skills/quality-test/SKILL.md +58 -3
package/.codex/skills/security-audit/SKILL.md +38 -0
package/.codex/skills/spec-map/SKILL.md +65 -8
package/.codex/skills/team-coordinate/SKILL.md +28 -11
package/.codex/skills/team-coordinate/specs/role-catalog.md +20 -0
package/.codex/skills/team-lifecycle-v4/SKILL.md +23 -7
package/.codex/skills/team-lifecycle-v4/instructions/agent-instruction.md +20 -0
package/.codex/skills/team-quality-assurance/SKILL.md +40 -2
package/.codex/skills/team-review/SKILL.md +42 -2
package/.codex/skills/team-tech-debt/SKILL.md +45 -2
package/.codex/skills/team-testing/SKILL.md +42 -2
package/dashboard/dist-server/dashboard/src/server/wiki/search.d.ts +6 -4
package/dashboard/dist-server/dashboard/src/server/wiki/search.js +50 -8
package/dashboard/dist-server/dashboard/src/server/wiki/search.js.map +1 -1
package/dashboard/dist-server/dashboard/src/server/wiki/virtual-wiki-adapters.d.ts +32 -0
package/dashboard/dist-server/dashboard/src/server/wiki/virtual-wiki-adapters.js +294 -0
package/dashboard/dist-server/dashboard/src/server/wiki/virtual-wiki-adapters.js.map +1 -1
package/dashboard/dist-server/dashboard/src/server/wiki/wiki-indexer.d.ts +1 -0
package/dashboard/dist-server/dashboard/src/server/wiki/wiki-indexer.js +35 -1
package/dashboard/dist-server/dashboard/src/server/wiki/wiki-indexer.js.map +1 -1
package/dashboard/dist-server/dashboard/src/server/wiki/wiki-indexer.test.js +235 -0
package/dashboard/dist-server/dashboard/src/server/wiki/wiki-indexer.test.js.map +1 -1
package/dist/src/ralph/cmd-check.js +1 -1
package/dist/src/ralph/cmd-check.js.map +1 -1
package/dist/src/ralph/cmd-complete.js +1 -1
package/dist/src/ralph/cmd-complete.js.map +1 -1
package/dist/src/ralph/cmd-next.d.ts.map +1 -1
package/dist/src/ralph/cmd-next.js +12 -4
package/dist/src/ralph/cmd-next.js.map +1 -1
package/dist/src/ralph/cmd-session.js +2 -2
package/dist/src/ralph/cmd-session.js.map +1 -1
package/dist/src/ralph/status-store.d.ts +8 -1
package/dist/src/ralph/status-store.d.ts.map +1 -1
package/dist/src/ralph/status-store.js +12 -2
package/dist/src/ralph/status-store.js.map +1 -1
package/dist/src/tools/store-knowhow.d.ts.map +1 -1
package/dist/src/tools/store-knowhow.js +51 -64
package/dist/src/tools/store-knowhow.js.map +1 -1
package/dist/src/utils/update-notices.js +12 -0
package/dist/src/utils/update-notices.js.map +1 -1
package/package.json +1 -1
package/workflows/finish-work.md +119 -0
package/workflows/milestone-complete.md +23 -1

package/.claude/commands/maestro-next.md ADDED

@@ -0,0 +1,217 @@
+---
+name: maestro-next
+description: Single-command recommendation — pick the best next command from the pool and execute it
+argument-hint: "<intent> [-y] [--dry-run] [--top N] [--list]"
+allowed-tools:
+  - Read
+  - Bash
+  - Glob
+  - Grep
+  - Skill
+  - AskUserQuestion
+---
+<purpose>
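+Single-command recommendation: based on the user's stated intent, pick the **single best-matching command** from the `.claude/commands/` command pool and, after confirmation, execute it via `Skill()`.
+How this differs from `/maestro` / `/maestro-ralph`:
+- `/maestro`, `/maestro-ralph`, `/maestro-ralph-execute`, `/maestro-ralph-beta`, `/maestro-player`, and `/maestro-composer` are **multi-step pipeline orchestrators**; this command never recommends them
+- This command always recommends exactly **1 atomic command** (the top pick) and lists at most 2-3 alternatives; once chosen, the command runs directly, with no session and no chain
+- When to use: the user's intent is clear and a single step is enough, or the user is unsure which specific command to run and wants a targeted recommendation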
+</purpose>
+<context>
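+$ARGUMENTS: user intent text plus optional flags.
+**Flags:**
+- `-y` / `--yes`: auto mode; skip confirmation and execute the top pick directly
+- `--dry-run`: show the recommendation only, without executing
+- `--top N`: show the top N candidates to choose from (default 3)
+- `--list`: list the pool of recommendable commands only, without making a recommendation
+**Candidate pool:** only commands listed in the Step 3 routing table take part in recommendation. Commands absent from the table (including the pipeline orchestrators `maestro` / `maestro-ralph*` / `maestro-player` / `maestro-composer`, among others) are never recommended by this command.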
+</context>
+<execution>
+### Step 1: Parse Arguments
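+Parse `-y` / `--dry-run` / `--top N` / `--list`; the remaining text becomes `intent`.
+- `--list` mode → jump to Step 3 (list only)
+- `intent` is empty and `--list` is not set → `AskUserQuestion`: ask the user for their intent (at most 1 round; if still empty, raise E001)
+### Step 2: Read Workflow State (basis for smart recommendation)
+Read the following project state to infer the "current lifecycle position" and the "natural next step":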
+```bash
+# 1. Current phase / milestone / latest artifact
+cat .workflow/state.json 2>$null
+# 2. Recent artifact directories (by mtime, newest first, top 3)
+ls -la .workflow/scratch/ 2>$null | head -10
+# 3. Whether a ralph/maestro session is in progress
+ls -la .workflow/.maestro/ 2>$null | head -5
+```
+From what was read, infer **lifecycle_position** (the core signal for the next-step recommendation):
+| Project state | lifecycle_position | Natural next step |
+|--------------|-------------------|-----------|
+| No `.workflow/` + no source code | `brainstorm` | `maestro-brainstorm` |
+| No `.workflow/` + source code present | `init` | `maestro-init` |
+| state.json present, no roadmap.md, no milestones | `analyze-macro` | `maestro-analyze` (macro-level research) |
+| Macro analyze artifact present, no roadmap.md | `roadmap` | `maestro-roadmap` |
+| Roadmap present, no phase started | `analyze` | `maestro-analyze {phase}` |
+| Latest artifact = `analyze` | `plan` | `maestro-plan {phase}` |
+| Latest artifact = `plan` | `execute` | `maestro-execute {phase}` |
+| Latest artifact = `execute` | `verify` | `maestro-verify {phase}` |
+| Latest artifact = `verify`, passed | `review` | `quality-review {phase}` |
+| Latest artifact = `review`, verdict=PASS | `test-gen` | `quality-auto-test {phase}` |
+| Latest artifact = `test`, all green | `milestone-audit` | `maestro-milestone-audit` |
+| All phases of the current milestone complete | `milestone-complete` | `maestro-milestone-complete` |
+| Any stage output contains gaps/failed | `debug` | `quality-debug {gap}` |
+**Maestro Lifecycle main line (the core workflow, used to infer the "next step"):**
+```
+brainstorm → blueprint → init → analyze-macro → roadmap
+   → [per phase] analyze → plan → execute → verify
+   → [quality gate] review → auto-test → test
+   → milestone-audit → milestone-complete → milestone-release
+```
+**Auxiliary workflow clusters** (triggered by scenario, not part of the main line):
+| Workflow cluster | Trigger scenario | Recommended commands |
+|-------------|---------|---------|
+| Learning | Encountering new code / unfamiliar modules | `learn-follow` → `learn-decompose` → `learn-second-opinion` |
+| Knowledge | Distilling experience / capturing knowledge | `manage-harvest` → `manage-knowhow-capture` → `spec-add` |
+| Wiki maintenance | Knowledge-graph cleanup | `manage-wiki` → `wiki-connect` → `wiki-digest` |
+| Issue governance | Defect management | `manage-issue-discover` → `manage-issue` |
+| Doc sync | After major code changes | `quality-sync` → `manage-codebase-refresh` |
+| Refactoring | Accumulated tech debt | `quality-refactor` → `quality-review` |
+| Release | End of a milestone | `maestro-milestone-audit` → `maestro-milestone-release` |
+| Parallel development | Multiple milestones in parallel | `maestro-fork` → ... → `maestro-merge` |
+### Step 2.5: Semantic Match & Rank
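+Score the commands in the routing table by combining the following signals (high → low):
+| Signal | Weight | Notes |
+|------|------|------|
+| intent hits a keyword in a routing-table row | High | Primary basis: literal match |
+| **The "natural next step" for lifecycle_position** | **High** | Becomes decisive when intent is empty or contains "继续" (continue) / "下一步" (next step) / "next" / "接下来" (what's next) |
+| `name` keyword appears in intent | Medium | e.g. intent contains "test" → quality-test/quality-auto-test score higher |
+| Workflow cluster match | Medium | Triggers the matching cluster when intent involves learning / knowledge / issue scenarios |
+| Recent-activity avoidance | Low | A stage that was just completed is down-weighted for a short time |
+**Special intent handling:**
+| Intent pattern | Handling |
+|------------|------|
+| Empty / "继续" (continue) / "下一步" (next step) / "next" / "接下来怎么走" (what do I do next) | Use the lifecycle_position "natural next step" directly as the top pick |
+| "什么状态" (what's the status) / "现在到哪了" (where are we now) / "status" | top pick = `manage-status` |
+| Literal hit on a routing-table keyword | Routing table takes priority; lifecycle acts as a bonus |
+| No match at all | top pick = lifecycle natural next step + W002 warning |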
+**Intent → command routing table** (the candidate pool equals this table plus the "natural next step" suggestions above):
+| Intent keywords | Recommended command |
+|-----------|---------|
+| 头脑风暴 (brainstorming) / 探索 (explore) / brainstorm / ideate | `maestro-brainstorm` |
+| 规格 (specification) / 正式文档 (formal document) / spec-generate / blueprint | `maestro-blueprint` |
+| 分析 (analysis) / analyze / 多维度调研 (multi-dimensional research) | `maestro-analyze` |
+| 规划 (planning) / plan / 任务分解 (task breakdown) | `maestro-plan` |
+| 实现 (implement) / 执行 (execute) / execute | `maestro-execute` |
+| 验证 (verify) / verify / 验收 (acceptance) | `maestro-verify` |
+| 调试 (debug) / debug / 排查 (troubleshoot) / bug | `quality-debug` |
+| 审查 (review) / review / 代码审查 (code review) | `quality-review` |
+| 测试 (test) / test / UAT | `quality-test` / `quality-auto-test` |
+| 重构 (refactor) / refactor / 技术债 (tech debt) | `quality-refactor` |
+| 同步文档 (sync docs) / sync docs | `quality-sync` |
+| 回顾 (retrospective) / retro | `quality-retrospective` / `learn-retro` |
+| issue / 缺陷管理 (defect management) | `manage-issue` / `manage-issue-discover` |
+| wiki / 知识图谱 (knowledge graph) | `manage-wiki` / `wiki-connect` / `wiki-digest` |
+| spec / 规则 (rules) / 约束 (constraints) | `spec-load` / `spec-add` / `spec-setup` |
+| 项目初始化 (project initialization) / init | `maestro-init` |
+| 状态 (status) / status / 仪表盘 (dashboard) | `manage-status` |
+| 文档重建 (docs rebuild) / codebase 文档 (codebase docs) | `manage-codebase-rebuild` / `manage-codebase-refresh` |
+| 安全 (security) / security / OWASP | `security-audit` |
+| 跟读 (read along) / 学习 (learn) / 阅读源码 (read source code) | `learn-follow` / `learn-investigate` |
+| 第二意见 (second opinion) / challenge / consult | `learn-second-opinion` |
+| 提取知识 (extract knowledge) / harvest | `manage-harvest` / `manage-knowhow-capture` |
+| 设计 (design) / UI / 前端打磨 (frontend polish) | `maestro-impeccable` |
+| 里程碑 (milestone) / milestone | `maestro-milestone-audit` / `maestro-milestone-release` / `maestro-milestone-complete` |
+| fork / 分支 (branch) / 并行开发 (parallel development) | `maestro-fork` / `maestro-merge` |
+| 覆盖层 (overlay) / overlay / amend | `maestro-overlay` / `maestro-amend` |
+Output the ranked candidates and take the top N (default 3).
+### Step 3: Present & Confirm
+**`--list` mode:** show all candidates plus descriptions, grouped by category (maestro / manage / quality / learn / spec / wiki / security), then stop.
+**Normal mode:**
+Display:
+```
+🎯 Recommended (top pick): /<command-name>
+   <description>
+   Reason: <one sentence on why it matched>
+Alternatives:
+  2. /<alt-1> — <description>
+  3. /<alt-2> — <description>
+Arguments: <args-to-pass>
+```
+- `--dry-run` → display, then stop
+- `-y` → execute the top pick directly
+- Otherwise → `AskUserQuestion` lets the user: execute the top pick / choose an alternative / edit the arguments / cancel
+### Step 4: Execute
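+Execute via `Skill({ skill: "<chosen-command-name>", args: "<args>" })`.
+**Argument passing:**
+- By default, pass the original intent text as the first argument to the target command
+- If the user edited the arguments in Step 3, use the edited version
+- Pass the `-y` flag through to the target command (if the target command supports it)
+After execution completes, display: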
+```
+✅ Executed /<command-name>
+```
+No session is created, no status.json is written, and no follow-up chain runs; the target command manages its own output.
+</execution>
+<error_codes>
+| Code | Severity | Condition | Recovery |
+|------|----------|-----------|----------|
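+| E001 | error | No intent provided, and still empty after clarification | Provide an intent description, or use `--list` to browse the command pool |
+| E002 | error | Candidate pool is empty (commands directory missing or contains no .md files) | Check that `.claude/commands/` exists |
+| E003 | error | The command name the user chose cannot be resolved to a valid skill | List the valid command names and let the user choose again |
+| W001 | warning | Several commands score close together (gap between top1 and top2 < threshold) | Force-show the top 3 and let the user decide |
+| W002 | warning | intent matches every candidate poorly | Suggest the user consider `/maestro` or `/maestro-ralph` to run a pipeline |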
+</error_codes>
+<success_criteria>
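+- [ ] Intent parsed and flags extracted
+- [ ] `.workflow/state.json` + scratch artifacts read to infer lifecycle_position
+- [ ] Candidate pool equals the routing table (pipeline orchestrators are simply not in the table)
+- [ ] Scoring combines: literal intent match + lifecycle natural next step + workflow cluster + recent activity
+- [ ] Empty intent / "继续" (continue) / "下一步" (next step) → adopt the lifecycle-inferred next step directly
+- [ ] Top pick is shown with a "reason" (matched rule + lifecycle position)
+- [ ] `--dry-run` displays only, without executing
+- [ ] `-y` executes the top pick automatically
+- [ ] In non-auto mode, the user confirms or picks an alternative via AskUserQuestion
+- [ ] The chosen command is executed with a single `Skill()` call
+- [ ] No session created, no status.json generated, no follow-up chain triggered
+- [ ] `--list` mode groups output by workflow cluster (main line / Learning / Knowledge / Wiki / Issue / docs / refactoring / release / parallel)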
+</success_criteria>
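
The flag handling that Step 1 of `maestro-next.md` describes could be sketched roughly as follows. This is an illustration only, not the package's implementation: the function name `parseNextArgs` and its return shape are assumptions, and only the flags themselves (`-y` / `--yes`, `--dry-run`, `--top N`, `--list`) and the default of 3 come from the command file. Quoted intents are not unwrapped here.

```javascript
// Hypothetical sketch of Step 1 argument parsing.
// `parseNextArgs` and the shape of the returned object are assumptions.
function parseNextArgs(argString) {
  const tokens = argString.trim().split(/\s+/).filter(Boolean);
  const opts = { yes: false, dryRun: false, list: false, top: 3, intent: "" };
  const rest = [];
  for (let i = 0; i < tokens.length; i++) {
    const t = tokens[i];
    if (t === "-y" || t === "--yes") {
      opts.yes = true;
    } else if (t === "--dry-run") {
      opts.dryRun = true;
    } else if (t === "--list") {
      opts.list = true;
    } else if (t === "--top") {
      // Consume the following token only when it is a positive integer.
      const n = Number.parseInt(tokens[i + 1], 10);
      if (Number.isInteger(n) && n > 0) {
        opts.top = n;
        i++;
      }
    } else {
      rest.push(t); // everything that is not a flag belongs to the intent
    }
  }
  opts.intent = rest.join(" ");
  return opts;
}
```

An empty `intent` with `list` unset is the case where the command file calls for one `AskUserQuestion` round before raising E001.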

package/.claude/commands/maestro.md CHANGED

@@ -23,7 +23,7 @@ Entry points:
 - **`/maestro --dry-run "intent"`** — Show chain, no execution
 - **`/maestro --super "intent"`** — Production-ready mode (read maestro-super.md)
-Session: `.workflow/.maestro/{session_id}/status.json`
+**Session**: `.workflow/.maestro/{session_id}/status.json` is the single source of truth for the workflow. The session_id format is `maestro-{YYYYMMDD-HHmmss}` (created by this command; static chain) or `ralph-{YYYYMMDD-HHmmss}` (created by `/maestro-ralph`; adaptive chain). Both kinds are advanced by `/maestro-ralph-execute`; the schema is shared with ralph (including `ralph_protocol_version: "1"` + `active_step_index`).
 </purpose>
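
For reference, a session id in the `{prefix}-{YYYYMMDD-HHmmss}` shape named in the changed line could be produced like this. The helper name `makeSessionId` is hypothetical, and the use of local time is an assumption, since the diff does not say whether local time or UTC is intended.

```javascript
// Hypothetical helper: formats `{prefix}-{YYYYMMDD-HHmmss}`.
// Assumption: local time is used; the source does not specify a time zone.
function makeSessionId(prefix, date = new Date()) {
  const pad = (n) => String(n).padStart(2, "0");
  const day = `${date.getFullYear()}${pad(date.getMonth() + 1)}${pad(date.getDate())}`;
  const time = `${pad(date.getHours())}${pad(date.getMinutes())}${pad(date.getSeconds())}`;
  return `${prefix}-${day}-${time}`;
}
```

Calling it with `"maestro"` or `"ralph"` yields the two id families the line distinguishes.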
 <deferred_reading>

package/.codex/skills/learn-decompose/SKILL.md CHANGED

@@ -54,7 +54,7 @@ Resolve target to file list. Load coding specs: `maestro spec load --category co
 ### Phase 2: Wave 1 — Parallel Dimension Scans
-Generate `tasks.csv` with 4 dimension rows (wave 1) + 1 cross-ref row (wave 2):
+Generate `tasks.csv` with 4 dimension rows (wave 1) + 1 cross-ref row (wave 2). Initialize every row with `status="pending"`. Filter `wave==N AND status=="pending"` when writing each wave CSV.
 | id | dimension | focus |
 |----|-----------|-------|
@@ -64,7 +64,38 @@ Generate `tasks.csv` with 4 dimension rows (wave 1) + 1 cross-ref row (wave 2):
 | 4 | error | Boundaries, retry/backoff, fallbacks, guards, logging |
 | 5 | cross-ref | Dedup + catalog from wave 1 findings |
-Each dimension agent returns:
+**output_schema** (both waves):
+```json
+{
+  "type": "object",
+  "properties": {
+    "id":            { "type": "string" },
+    "result_status": { "type": "string", "enum": ["completed", "failed"] },
+    "dimension":     { "type": "string", "enum": ["structural", "behavioral", "data", "error", "cross-ref"] },
+    "patterns":      { "type": "string", "description": "JSON array string: [{name, dimension, confidence, anchors, description, rationale, tradeoffs}]" },
+    "findings":      { "type": "string", "maxLength": 500 },
+    "error":         { "type": "string" }
+  },
+  "required": ["id", "result_status", "findings"]
+}
+```
+Merge: `result_status` → master `status`; copy `dimension`, `patterns`, `findings`, `error`.
+**Shared termination contract** (embed in every instruction):
+```
+You MUST call report_agent_job_result EXACTLY ONCE before exiting.
+- Success → result_status=completed (patterns may be empty array if nothing found)
+- Failure → result_status=failed with error message
+- Timeout → near max_runtime_seconds → result_status=completed with partial patterns
+- NEVER continue indefinitely. NEVER exit silently. NEVER omit the call.
+- Every finding MUST include file:line anchors. No speculation.
+- Read-only analysis. Do NOT modify source.
+Do NOT write to tasks.csv, wave-*.csv, results.csv. Do NOT call spawn_agents_on_csv (no recursion).
+```
+Each dimension agent populates `patterns` as a JSON array string of:
 ```json
 [{
   "name": "pattern name",
@@ -79,7 +110,7 @@ Each dimension agent returns:
 ### Phase 3: Wave 2 — Cross-Reference + Catalog
-Single agent receives all wave 1 findings via `prev_context`. Tasks:
+Single agent receives all wave 1 findings via `prev_context`. Uses same `output_schema` + termination contract above. Tasks:
 - Match against dedup set → mark as `documented`, `known`, or `new`
 - Merge duplicates across dimensions (same pattern found by multiple agents)
 - Flag contradictions with documented conventions
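
The wave filter (`wave==N AND status=="pending"`) and the merge rule (`result_status` → master `status`, plus copied fields) added in this hunk can be sketched on in-memory rows. The function names `selectWave` and `mergeResults` are assumptions for illustration, and rows are plain objects rather than parsed CSV.

```javascript
// Hypothetical in-memory sketch; real rows would come from tasks.csv.
// Keep only rows of the requested wave that have not been processed yet.
function selectWave(rows, wave) {
  return rows.filter((r) => Number(r.wave) === wave && r.status === "pending");
}

// Map result_status onto the master status column and copy the listed fields.
function mergeResults(masterRows, results, copyFields) {
  const byId = new Map(results.map((r) => [r.id, r]));
  return masterRows.map((row) => {
    const res = byId.get(row.id);
    if (!res) return row; // no result reported for this row: leave untouched
    const merged = { ...row, status: res.result_status };
    for (const field of copyFields) {
      if (field in res) merged[field] = res[field];
    }
    return merged;
  });
}
```

Because merged rows no longer have `status === "pending"`, re-running the filter for the same wave skips work that has already reported a result.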

package/.codex/skills/learn-retro/SKILL.md CHANGED

@@ -44,7 +44,7 @@ $ARGUMENTS — lens selection and scope flags.
 **3a: Collect decisions** from wiki, specs, git log, phase context, .workflow/specs/learnings.md.
 **3b: Build decision registry** per decision (id, title, source, rationale, alternatives, evidence).
-**3c: Multi-perspective evaluation** via spawn_agents_on_csv (3 parallel agents):
+**3c: Multi-perspective evaluation** via spawn_agents_on_csv (3 parallel agents; filter `wave==1 AND status=="pending"`):
 | id | perspective | focus |
 |----|------------|-------|
@@ -52,6 +52,36 @@ $ARGUMENTS — lens selection and scope flags.
 | 2 | cost | Complexity added, coupling, tech debt. Grade: low-cost/acceptable/expensive |
 | 3 | hindsight | Right call with current knowledge? Grade: confirmed/questionable/should-revisit |
+**output_schema**:
+```json
+{
+  "type": "object",
+  "properties": {
+    "id":            { "type": "string" },
+    "result_status": { "type": "string", "enum": ["completed", "failed"] },
+    "perspective":   { "type": "string", "enum": ["technical", "cost", "hindsight"] },
+    "grade":         { "type": "string" },
+    "findings":      { "type": "string", "maxLength": 500 },
+    "error":         { "type": "string" }
+  },
+  "required": ["id", "result_status", "grade", "findings"]
+}
+```
+Merge: `result_status` → master `status`; copy `perspective`, `grade`, `findings`, `error`.
+**Shared termination contract** (embed in every instruction):
+```
+You MUST call report_agent_job_result EXACTLY ONCE before exiting.
+- Success → result_status=completed with concrete grade
+- Failure → result_status=failed with error message
+- Timeout → near max_runtime_seconds → result_status=failed, error="timeout (partial)"
+- NEVER continue indefinitely. NEVER exit silently. NEVER omit the call.
+- Read-only analysis. Do NOT modify source files.
+Do NOT write to tasks.csv, wave-*.csv, results.csv. Do NOT call spawn_agents_on_csv (no recursion).
+```
 **3d: Classify lifecycle**: Validated / Aging / Questionable / Stale / Reversed.
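
As a rough illustration of what the step 3c `output_schema` requires of a worker result, the check below covers only the `required` list, the two enums, and the `findings` length limit. It is a hand-rolled sketch, not a JSON Schema validator, and `RETRO_SCHEMA` and `checkResult` are hypothetical names.

```javascript
// Minimal hand-rolled check mirroring the learn-retro output_schema.
// Not a full JSON Schema validator; names here are assumptions.
const RETRO_SCHEMA = {
  required: ["id", "result_status", "grade", "findings"],
  enums: {
    result_status: ["completed", "failed"],
    perspective: ["technical", "cost", "hindsight"],
  },
  maxLength: { findings: 500 },
};

function checkResult(result, schema = RETRO_SCHEMA) {
  const errors = [];
  for (const key of schema.required) {
    if (typeof result[key] !== "string") errors.push(`missing or non-string: ${key}`);
  }
  for (const [key, allowed] of Object.entries(schema.enums)) {
    if (key in result && !allowed.includes(result[key])) {
      errors.push(`bad value for ${key}: ${result[key]}`);
    }
  }
  for (const [key, max] of Object.entries(schema.maxLength)) {
    if (typeof result[key] === "string" && result[key].length > max) {
      errors.push(`${key} exceeds ${max} chars`);
    }
  }
  return errors; // an empty array means the result passed these checks
}
```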
 ### Phase 4: Unified Report

package/.codex/skills/learn-second-opinion/SKILL.md CHANGED

@@ -47,12 +47,42 @@ Resolve target to content. Load specs, wiki search, prior lessons for context br
 | 3 | strategist | Scalability, extensibility, architecture alignment | coupling, cohesion |
 | 4 | synthesis | Merge verdicts → agreements, disagreements, top 3 recommendations | combined verdict |
-Wave 1: 3 persona agents in parallel. Wave 2: synthesis agent with wave 1 findings as prev_context.
-Each persona returns: `{ persona, verdict: approve|concern|reject, confidence, findings: [{severity, description, location, suggestion}], summary }`
+Wave 1: 3 persona agents in parallel (filter `wave==1 AND status=="pending"`). Wave 2: synthesis agent with wave 1 findings as prev_context.
+**output_schema** (both waves):
+```json
+{
+  "type": "object",
+  "properties": {
+    "id":            { "type": "string" },
+    "result_status": { "type": "string", "enum": ["completed", "failed"] },
+    "persona":       { "type": "string" },
+    "verdict":       { "type": "string", "enum": ["approve", "concern", "reject", ""] },
+    "confidence":    { "type": "string", "description": "0-100" },
+    "findings":      { "type": "string", "description": "JSON array of {severity, description, location, suggestion}, max 500 chars summary" },
+    "summary":       { "type": "string", "maxLength": 500 },
+    "error":         { "type": "string" }
+  },
+  "required": ["id", "result_status", "summary"]
+}
+```
+Merge: `result_status` → master `status`; copy `persona`, `verdict`, `confidence`, `findings`, `summary`, `error`.
+**Shared termination contract** (embed in every instruction):
+```
+You MUST call report_agent_job_result EXACTLY ONCE before exiting.
+- Success → result_status=completed with concrete verdict
+- Failure → result_status=failed with error message
+- Timeout → near max_runtime_seconds → result_status=failed, error="timeout (partial)"
+- NEVER continue indefinitely. NEVER exit silently. NEVER omit the call.
+- Read-only analysis. Do NOT modify source files.
+Do NOT write to tasks.csv, wave-*.csv, results.csv. Do NOT call spawn_agents_on_csv (no recursion).
+```
 #### Challenge Mode
-Single agent via spawn_agents_on_csv (1 worker). Adversarial analysis with forcing questions:
+Single agent via spawn_agents_on_csv (max_concurrency: 1) with the same `output_schema` + termination contract above. Adversarial analysis with forcing questions:
 - "What assumption would invalidate this entire approach?"
 - "What's the simplest thing that breaks this?"
 - "What's the implicit contract that isn't enforced?"

package/.codex/skills/maestro-analyze/SKILL.md CHANGED

@@ -152,29 +152,63 @@ S_AGGREGATE:
 <actions>
+### Shared Spawn Contract (all three waves)
+Every `spawn_agents_on_csv` call in this skill MUST use the strict JSON Schema below and the shared termination contract.
+**Output Schema**:
+```json
+{
+  "type": "object",
+  "properties": {
+    "id":            { "type": "string" },
+    "result_status": { "type": "string", "enum": ["completed", "failed", "blocked"] },
+    "findings":      { "type": "string", "maxLength": 500 },
+    "score":         { "type": "string", "description": "0-100 (wave 2 scoring only)" },
+    "evidence":      { "type": "string", "description": "Code refs file:line (wave 1/2)" },
+    "error":         { "type": "string" }
+  },
+  "required": ["id", "result_status", "findings"]
+}
+```
+Merge step: `result_status` → master `status`; copy `findings`, `score`, `evidence`, `error`.
+**Termination contract** (embed in every instruction):
+```
+You MUST call report_agent_job_result EXACTLY ONCE before exiting.
+- Success → result_status=completed
+- Failure → result_status=failed with error message
+- Blocked → upstream missing → result_status=blocked
+- Timeout → near max_runtime_seconds → result_status=blocked, error="timeout"
+- NEVER continue indefinitely. NEVER exit silently. NEVER omit the call.
+Do NOT write to tasks.csv, wave-*.csv, results.csv. Do NOT call spawn_agents_on_csv (no recursion).
+```
 ### A_SPAWN_WAVE_1
-Filter wave==1 -> write wave-1.csv -> `spawn_agents_on_csv({ csv_path, max_concurrency })`.
+Filter `wave==1 AND status=="pending"` -> write wave-1.csv -> `spawn_agents_on_csv({ csv_path, id_column:"id", instruction: EXPLORATION_INSTRUCTION + SHARED_TERMINATION_CONTRACT, max_concurrency, max_runtime_seconds: 3600, output_csv_path, output_schema })`.
 **Exploration agent** (3-layer per dimension):
 1. Module Discovery (breadth): keyword search, relevant files, module boundaries
 2. Structure Tracing (depth): top 3-5 files, call chains 2-3 levels, data flow
 3. Code Anchor Extraction (detail): code snippet 20-50 lines with file:line per finding
-Share via discovery board. Merge results -> master tasks.csv.
+Share via discovery board. Merge results -> master tasks.csv (map `result_status` → master `status`).
 ### A_SPAWN_WAVE_2
-Filter wave==2 -> build prev_context from wave 1 findings -> write wave-2.csv -> spawn.
+Filter `wave==2 AND status=="pending"` -> build prev_context from wave 1 findings -> write wave-2.csv -> spawn with `SCORING_INSTRUCTION + SHARED_TERMINATION_CONTRACT`.
 **Scoring agent** (6 dimensions: feasibility, impact, risk, complexity, alignment, maintainability):
 Score 0-100 with specific evidence (code refs from exploration). Each score MUST reference exploration findings.
-Merge results -> master tasks.csv.
+Merge results -> master tasks.csv (map `result_status` → master `status`).
 ### A_SPAWN_WAVE_3
-Filter wave==3 -> build prev_context from wave 2 scores (or project context for quick mode) -> spawn.
+Filter `wave==3 AND status=="pending"` -> build prev_context from wave 2 scores (or project context for quick mode) -> spawn with `SYNTHESIS_INSTRUCTION + SHARED_TERMINATION_CONTRACT`.
 **Synthesis agent**:
 - Full mode: analysis.md (executive summary, per-dimension scores, risk matrix, Go/No-Go), context.md (Locked/Free/Deferred decisions), context-package.json, conclusions.json (with `scope_verdict` + `implementation_scope[]`)
@@ -262,5 +296,10 @@ Protocol: read before analysis, append-only, dedup by type+key.
 - [ ] Upstream context loaded via `--from` when specified
 - [ ] discoveries.ndjson append-only throughout
 - [ ] Next step routed (plan for Go, brainstorm for No-Go, plan --gaps for Gaps)
+- [ ] Session sealed via finish-work (archive.json written, optional spec/knowhow extraction)
 </success_criteria>
+<on_complete>
+@~/.maestro/workflows/finish-work.md — SESSION_DIR=OUTPUT_DIR, SESSION_TYPE=analyze, SESSION_ID={artifact_id}, LINKED_MILESTONE={target_milestone or null}
+</on_complete>
 </output>
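
The shared spawn contract in `maestro-analyze` amounts to building the same argument object for each wave, with only the instruction text changing. The sketch below assembles that object without calling `spawn_agents_on_csv`, whose runtime is outside this diff. `buildSpawnArgs` is a hypothetical helper, and the `wave-N-results.csv` output name is borrowed from the `maestro-execute` hunk, so treat it as an assumption here.

```javascript
// Sketch only: builds the argument object for one wave.
// It does not call spawn_agents_on_csv.
const SHARED_TERMINATION_CONTRACT = [
  "You MUST call report_agent_job_result EXACTLY ONCE before exiting.",
  "- Success → result_status=completed",
  "- Failure → result_status=failed with error message",
  "- Blocked → upstream missing → result_status=blocked",
  '- Timeout → near max_runtime_seconds → result_status=blocked, error="timeout"',
  "- NEVER continue indefinitely. NEVER exit silently. NEVER omit the call.",
  "Do NOT write to tasks.csv, wave-*.csv, results.csv. Do NOT call spawn_agents_on_csv (no recursion).",
].join("\n");

const OUTPUT_SCHEMA = {
  type: "object",
  properties: {
    id: { type: "string" },
    result_status: { type: "string", enum: ["completed", "failed", "blocked"] },
    findings: { type: "string", maxLength: 500 },
    score: { type: "string" },
    evidence: { type: "string" },
    error: { type: "string" },
  },
  required: ["id", "result_status", "findings"],
};

// `buildSpawnArgs` and the results-file naming are assumptions for illustration.
function buildSpawnArgs({ sessionFolder, wave, instruction, maxConcurrency }) {
  return {
    csv_path: `${sessionFolder}/wave-${wave}.csv`,
    id_column: "id",
    instruction: `${instruction}\n\n${SHARED_TERMINATION_CONTRACT}`,
    max_concurrency: maxConcurrency,
    max_runtime_seconds: 3600,
    output_csv_path: `${sessionFolder}/wave-${wave}-results.csv`,
    output_schema: OUTPUT_SCHEMA,
  };
}
```

Keeping the contract and schema as shared constants is one way to make sure all three waves stay in step when either changes.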

package/.codex/skills/maestro-blueprint/SKILL.md CHANGED

@@ -120,4 +120,9 @@ P6 gate: Pass (>=80%) → Handoff | Review (60-79%) → Handoff w/caveats | Fail
 - [ ] Readiness gate: Pass (>=80%) or Review (>=60%) with documented caveats
 - [ ] Artifact registered in state.json (type=blueprint)
 - [ ] context-package.json generated for downstream consumption
+- [ ] On gate Pass/Review: session sealed via finish-work (archive.json + optional spec/knowhow extraction). On Fail: skip — session stays active, excluded from wiki search.
 </success_criteria>
+<on_complete>
+@~/.maestro/workflows/finish-work.md — SESSION_DIR={session_dir}, SESSION_TYPE=blueprint, SESSION_ID={session_id}, LINKED_MILESTONE=null
+</on_complete>
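
The readiness gate and the new sealing rule combine into a small decision: Pass and Review seal the session via finish-work, while Fail leaves it active. A minimal sketch follows, assuming the score is a percentage and that anything below 60 counts as Fail (the hunk header is truncated, so the Fail bound is inferred); `classifyGate` is a hypothetical name.

```javascript
// Hypothetical sketch of the blueprint readiness gate.
// Assumption: scores below 60 are Fail; the diff shows only the Pass and Review bounds.
function classifyGate(scorePercent) {
  if (scorePercent >= 80) return { gate: "Pass", sealSession: true };
  if (scorePercent >= 60) return { gate: "Review", sealSession: true };
  return { gate: "Fail", sealSession: false }; // session stays active
}
```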

package/.codex/skills/maestro-brainstorm/SKILL.md CHANGED

@@ -212,6 +212,47 @@ CONSTRAINTS:
 7. **DO NOT STOP**: Continuous until all waves complete; only pause at [CHECKPOINT] (skipped with -y).
 </invariants>
+<spawn_contract>
+All three waves invoke `spawn_agents_on_csv` with the same shape — only `instruction` (inflated from `<agent_prompt_template>`) and `max_concurrency` differ. The orchestrator MUST:
+1. Filter master tasks.csv to `wave==N AND status=="pending"` before writing `wave-{N}.csv`.
+2. Use the strict JSON Schema below for `output_schema`.
+3. Append the shared termination contract to every inflated `description`.
+4. Merge: map `result_status` → master `status`; copy `findings`, `output_path`, `error`.
+**output_schema** (all waves):
+```json
+{
+  "type": "object",
+  "properties": {
+    "id":            { "type": "string" },
+    "result_status": { "type": "string", "enum": ["completed", "failed", "blocked"] },
+    "findings":      { "type": "string", "maxLength": 500 },
+    "output_path":   { "type": "string", "description": "Primary deliverable absolute path (W1: guidance-specification.md; W2: {role}/analysis.md; W3: review-findings.json)" },
+    "error":         { "type": "string" }
+  },
+  "required": ["id", "result_status", "findings"]
+}
+```
+**Shared termination contract** (append to every inflated W1/W2/W3 description):
+```
+TERMINATION CONTRACT (mandatory — NO worker may end without calling report_agent_job_result):
+  - Success path → all required files written AND verified via Glob → result_status=completed, output_path set
+  - Failure path → unrecoverable error (write fail, missing input file) → result_status=failed
+  - Blocked path → upstream missing (W2 cannot read guidance-spec; W3 cannot read analysis.md) → result_status=blocked
+  - Timeout path → near max_runtime_seconds → finalize current write if safe → otherwise report blocked with error="timeout"
+  - NEVER continue indefinitely. NEVER exit silently. NEVER omit the call.
+  - NEVER return analysis as chat text — files on disk are the ONLY valid deliverable.
+  - Do NOT write to tasks.csv, wave-*.csv, results.csv.
+  - Do NOT call spawn_agents_on_csv (no recursion).
+```
+</spawn_contract>
 <state_machine>
 <states>
@@ -387,4 +428,9 @@ Protocol: read before analysis, append-only, dedup by type+key.
 - [ ] context-package.json generated with per-item `ref` traceability
 - [ ] discoveries.ndjson append-only throughout
 - [ ] context.md aggregates session results with next-step routing
+- [ ] Session sealed via finish-work (auto mode only)
 </success_criteria>
+<on_complete>
+@~/.maestro/workflows/finish-work.md — SESSION_DIR={output_dir}, SESSION_TYPE=brainstorm, SESSION_ID={artifact_id}, LINKED_MILESTONE=null
+</on_complete>

package/.codex/skills/maestro-execute/SKILL.md CHANGED

@@ -257,16 +257,72 @@ For each wave N in ascending order:
 ```javascript
 spawn_agents_on_csv({
-  csv_path: `${sessionFolder}/wave-${N}.csv`,
+  csv_path: `${sessionFolder}/wave-${N}.csv`,    // only rows where wave==N AND status=="pending"
   id_column: "id",
-  instruction: buildExecutorInstruction(sessionFolder, phaseDir, autoCommit, specsContent),  // agent: ~/.codex/agents/workflow-executor.toml
-  max_concurrency: maxConcurrency, max_runtime_seconds: 3600,
+  instruction: EXECUTOR_INSTRUCTION,              // see "Executor Worker Contract" below
+  max_concurrency: maxConcurrency,
+  max_runtime_seconds: 3600,
   output_csv_path: `${sessionFolder}/wave-${N}-results.csv`,
-  output_schema: { id, result_status: [completed|failed|blocked], findings, files_modified, tests_passed, error }
+  output_schema: {
+    type: "object",
+    properties: {
+      id:             { type: "string" },
+      result_status:  { type: "string", enum: ["completed", "failed", "blocked"] },
+      findings:       { type: "string", maxLength: 500 },
+      files_modified: { type: "string", description: "Semicolon-separated paths" },
+      tests_passed:   { type: "string", enum: ["true", "false", "n/a"] },
+      error:          { type: "string" }
+    },
+    required: ["id", "result_status", "findings"]
+  }
 })
 ```
-4. Merge results into master `tasks.csv`: map `result_status` from `wave-{N}-results.csv` to the `status` column in master CSV. Delete `wave-{N}.csv` AND `wave-{N}-results.csv` after merge.
+4. Merge results into master `tasks.csv`: map `result_status` from `wave-{N}-results.csv` to the `status` column in master CSV; copy `findings`, `files_modified`, `tests_passed`, `error`. Delete `wave-{N}.csv` AND `wave-{N}-results.csv` after merge.
+#### Executor Worker Contract (EXECUTOR_INSTRUCTION)
+The literal `instruction` string passed to `spawn_agents_on_csv` MUST include the following contract (substitute `{sessionFolder}`, `{phaseDir}`, `{autoCommit}`, `{specsContent}` at build time):
+```
+You are a task executor. ONE task row is assigned to you.
+INPUT (from your CSV row):
+  - id, title, description, prev_context (findings from upstream tasks)
+  - meta.tdd_phase (red|green|refactor) if TDD mode is enabled
+REQUIRED STEPS:
+  1. Read prev_context — depend on upstream findings, not memory
+  2. Read shared discoveries: {sessionFolder}/discoveries.ndjson
+  3. Implement the task: edit/create files per description
+  4. Run verification — relevant tests; if TDD, honor tdd_phase semantics
+  5. If autoCommit and task succeeded → commit changes with task ID in message
+  6. Append discoveries (type=implementation_note / pattern) to discoveries.ndjson
+  7. Call report_agent_job_result EXACTLY ONCE
+TERMINATION CONTRACT (mandatory — NO worker may end without calling report_agent_job_result):
+  - Success path → all files written, tests pass → result_status=completed, tests_passed="true"
+  - Blocked path → cannot proceed (missing dep, unclear requirement, contract violation) → result_status=blocked with error explaining what is needed
+  - Failure path → unrecoverable error (build error, file write fail) → result_status=failed with error message
+  - Timeout path → approaching max_runtime_seconds → revert partial work, report blocked with error="timeout"
+  - NEVER continue indefinitely. NEVER exit silently. NEVER omit the call.
+OUTPUT (return via report_agent_job_result; must match output_schema):
+  {
+    "id": "<your row id>",
+    "result_status": "completed" | "failed" | "blocked",
+    "findings": "<one-sentence summary, max 500 chars>",
+    "files_modified": "<semicolon-separated paths or empty>",
+    "tests_passed": "true" | "false" | "n/a",
+    "error": "<message if not completed, else empty>"
+  }
+CONSTRAINTS:
+  - Modify ONLY files implicated by the task description and prev_context.
+  - Do NOT write to tasks.csv, wave-*.csv, results.csv, plan.json, or state.json — orchestrator owns those.
+  - Do NOT call spawn_agents_on_csv (no recursion).
+  - Honor specs loaded by orchestrator (passed via instruction context).
+```
 #### Blocked Task Handling