npm - maestro-flow - Versions diffs - 0.4.1 → 0.4.3 - Mend

maestro-flow 0.4.1 → 0.4.3

This diff shows the contents of publicly released package versions as they appear in their public registries. It is provided for informational purposes only.

Files changed (115)

package/.claude/commands/maestro-analyze.md +1 -1
package/.claude/commands/maestro-brainstorm.md +1 -1
package/.claude/commands/maestro-collab.md +1 -1
package/.claude/commands/maestro-execute.md +10 -1
package/.claude/commands/maestro-guard.md +101 -0
package/.claude/commands/maestro-impeccable.md +77 -74
package/.claude/commands/maestro-plan.md +15 -2
package/.claude/commands/maestro-ralph-execute.md +9 -2
package/.claude/commands/maestro-ralph.md +8 -1
package/.claude/commands/maestro-verify.md +15 -1
package/.claude/commands/quality-auto-test.md +1 -1
package/.claude/commands/quality-debug.md +1 -1
package/.claude/commands/quality-refactor.md +1 -1
package/.claude/commands/quality-retrospective.md +1 -1
package/.claude/commands/quality-review.md +15 -1
package/.claude/commands/quality-test.md +1 -1
package/.claude/commands/security-audit.md +154 -0
package/.claude/skills/maestro-help/index/catalog.json +2 -0
package/.codex/skills/maestro-analyze/SKILL.md +18 -1
package/.codex/skills/maestro-brainstorm/SKILL.md +17 -4
package/.codex/skills/maestro-collab/SKILL.md +7 -1
package/.codex/skills/maestro-execute/SKILL.md +365 -348
package/.codex/skills/maestro-guard/SKILL.md +97 -0
package/.codex/skills/maestro-impeccable/SKILL.md +76 -73
package/.codex/skills/maestro-plan/SKILL.md +66 -7
package/.codex/skills/maestro-ralph/SKILL.md +1 -1
package/.codex/skills/maestro-verify/SKILL.md +18 -1
package/.codex/skills/quality-auto-test/SKILL.md +13 -3
package/.codex/skills/quality-debug/SKILL.md +362 -346
package/.codex/skills/quality-refactor/SKILL.md +1 -1
package/.codex/skills/quality-retrospective/SKILL.md +292 -292
package/.codex/skills/quality-review/SKILL.md +374 -365
package/.codex/skills/quality-test/SKILL.md +1 -1
package/.codex/skills/security-audit/SKILL.md +154 -0
package/bin/maestro-hook-runner.js +21 -1
package/dashboard/dist-server/src/coordinator/output-parser.js +27 -0
package/dashboard/dist-server/src/coordinator/output-parser.js.map +1 -1
package/dist/src/commands/coordinate.d.ts.map +1 -1
package/dist/src/commands/coordinate.js +2 -0
package/dist/src/commands/coordinate.js.map +1 -1
package/dist/src/commands/hooks.d.ts +49 -0
package/dist/src/commands/hooks.d.ts.map +1 -1
package/dist/src/commands/hooks.js +236 -33
package/dist/src/commands/hooks.js.map +1 -1
package/dist/src/commands/install-backend.d.ts +2 -0
package/dist/src/commands/install-backend.d.ts.map +1 -1
package/dist/src/commands/install-backend.js +72 -0
package/dist/src/commands/install-backend.js.map +1 -1
package/dist/src/commands/install.d.ts.map +1 -1
package/dist/src/commands/install.js +15 -2
package/dist/src/commands/install.js.map +1 -1
package/dist/src/coordinator/output-parser.d.ts.map +1 -1
package/dist/src/coordinator/output-parser.js +27 -0
package/dist/src/coordinator/output-parser.js.map +1 -1
package/dist/src/hooks/delegate-monitor.d.ts +1 -0
package/dist/src/hooks/delegate-monitor.d.ts.map +1 -1
package/dist/src/hooks/delegate-monitor.js +1 -1
package/dist/src/hooks/delegate-monitor.js.map +1 -1
package/dist/src/hooks/guards/workflow-guard.d.ts +15 -0
package/dist/src/hooks/guards/workflow-guard.d.ts.map +1 -1
package/dist/src/hooks/guards/workflow-guard.js +61 -1
package/dist/src/hooks/guards/workflow-guard.js.map +1 -1
package/dist/src/hooks/plugins/decision-log-plugin.d.ts +19 -0
package/dist/src/hooks/plugins/decision-log-plugin.d.ts.map +1 -0
package/dist/src/hooks/plugins/decision-log-plugin.js +28 -0
package/dist/src/hooks/plugins/decision-log-plugin.js.map +1 -0
package/dist/src/hooks/plugins/index.d.ts +2 -0
package/dist/src/hooks/plugins/index.d.ts.map +1 -1
package/dist/src/hooks/plugins/index.js +1 -0
package/dist/src/hooks/plugins/index.js.map +1 -1
package/dist/src/hooks/session-context.d.ts +1 -0
package/dist/src/hooks/session-context.d.ts.map +1 -1
package/dist/src/hooks/session-context.js +1 -1
package/dist/src/hooks/session-context.js.map +1 -1
package/dist/src/hooks/skill-context.d.ts +1 -0
package/dist/src/hooks/skill-context.d.ts.map +1 -1
package/dist/src/hooks/skill-context.js +1 -1
package/dist/src/hooks/skill-context.js.map +1 -1
package/dist/src/hooks/spec-injector.d.ts.map +1 -1
package/dist/src/hooks/spec-injector.js +2 -0
package/dist/src/hooks/spec-injector.js.map +1 -1
package/dist/src/i18n/locales/en.d.ts.map +1 -1
package/dist/src/i18n/locales/en.js +13 -0
package/dist/src/i18n/locales/en.js.map +1 -1
package/dist/src/i18n/locales/zh.d.ts.map +1 -1
package/dist/src/i18n/locales/zh.js +13 -0
package/dist/src/i18n/locales/zh.js.map +1 -1
package/dist/src/i18n/types.d.ts +7 -0
package/dist/src/i18n/types.d.ts.map +1 -1
package/dist/src/tui/install-ui/InstallConfirm.d.ts +5 -0
package/dist/src/tui/install-ui/InstallConfirm.d.ts.map +1 -1
package/dist/src/tui/install-ui/InstallConfirm.js +1 -1
package/dist/src/tui/install-ui/InstallConfirm.js.map +1 -1
package/dist/src/tui/install-ui/InstallExecution.d.ts +2 -0
package/dist/src/tui/install-ui/InstallExecution.d.ts.map +1 -1
package/dist/src/tui/install-ui/InstallExecution.js +22 -3
package/dist/src/tui/install-ui/InstallExecution.js.map +1 -1
package/dist/src/tui/install-ui/InstallFlow.d.ts +1 -1
package/dist/src/tui/install-ui/InstallFlow.d.ts.map +1 -1
package/dist/src/tui/install-ui/InstallFlow.js +25 -4
package/dist/src/tui/install-ui/InstallFlow.js.map +1 -1
package/dist/src/tui/install-ui/InstallHub.d.ts +5 -0
package/dist/src/tui/install-ui/InstallHub.d.ts.map +1 -1
package/dist/src/tui/install-ui/InstallHub.js +16 -0
package/dist/src/tui/install-ui/InstallHub.js.map +1 -1
package/dist/src/tui/install-ui/InstallResult.d.ts.map +1 -1
package/dist/src/tui/install-ui/InstallResult.js +1 -1
package/dist/src/tui/install-ui/InstallResult.js.map +1 -1
package/package.json +1 -1
package/workflows/debug.md +73 -0
package/workflows/execute.md +27 -0
package/workflows/plan.md +11 -0
package/workflows/review.md +33 -1
package/workflows/tdd.md +257 -0
package/workflows/verify.md +57 -0

package/.codex/skills/maestro-guard/SKILL.md ADDED

@@ -0,0 +1,97 @@
+---
+name: maestro-guard
+description: Manage editing boundary restrictions
+argument-hint: "<on|off|status|allow <path>|deny <path>>"
+allowed-tools: Read, Write, Bash, Glob
+---
+<purpose>
+Configure directory-level write boundaries enforced by the workflow-guard PreToolUse hook.
+When enabled, Write and Edit tool calls targeting files outside allowed paths are blocked.
+Subcommands:
+- **on** -- Enable path guard (defaults to `src/` if no paths configured)
+- **off** -- Disable path guard (preserves path list)
+- **status** -- Show current guard configuration
+- **allow `<path>`** -- Add a directory to the allowed paths list
+- **deny `<path>`** -- Switch to deny mode and add path to deny list
+</purpose>
+<context>
+$ARGUMENTS -- Parse subcommand and optional path argument.
+**Config location:** `.workflow/config.json` -> `guard` section
+```json
+{
+  "guard": {
+    "enabled": false,
+    "mode": "allow",
+    "paths": []
+  }
+}
+```
+**Enforcement:** The `workflow-guard` hook (PreToolUse on Write/Edit) reads this config
+and blocks operations targeting files outside boundaries. Requires hooks level >= `full`.
+</context>
+<execution>
+**Step 1: Parse subcommand**
+Extract from $ARGUMENTS:
+- `on` / `off` / `status` / `allow <path>` / `deny <path>`
+- If no subcommand, default to `status`
+**Step 2: Read config**
+Read `.workflow/config.json`. If file missing, initialize with empty guard section.
+**Step 3: Execute subcommand**
+**`status`:**
+- Display: enabled/disabled, mode (allow/deny), paths list
+- Check if workflow-guard hook is active (read `.codex/settings.json` for hook presence)
+- If guard enabled but hook not active, warn: "WARNING: PathGuard enabled but workflow-guard hook not installed. Run `maestro hooks level full` to activate."
+**`on`:**
+- Set `guard.enabled = true`
+- If `guard.paths` is empty, set default: `["src/", "tests/", ".workflow/"]`
+- Check hook level, warn if < full
+- Write config
+**`off`:**
+- Set `guard.enabled = false`
+- Preserve existing paths and mode
+- Write config
+**`allow <path>`:**
+- Normalize path to forward slashes, ensure trailing slash for directories
+- If `guard.mode` is `deny`, switch to `allow` and clear paths with warning
+- Add path to `guard.paths` (deduplicate)
+- Set `guard.enabled = true` if not already
+- Write config
+**`deny <path>`:**
+- Normalize path to forward slashes
+- Set `guard.mode = "deny"`
+- Add path to `guard.paths` (deduplicate)
+- Set `guard.enabled = true` if not already
+- Write config
+**Step 4: Confirm**
+Display updated guard configuration.
+</execution>
+<error_codes>
+- E001: `.workflow/config.json` not found and cannot be created (not a maestro project)
+- W001: PathGuard enabled but workflow-guard hook not installed
+</error_codes>
+<success_criteria>
+- [ ] Config read/written correctly
+- [ ] Hook level warning displayed when applicable
+- [ ] Updated configuration shown after changes
+</success_criteria>
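
The allow/deny semantics described in this skill file can be sketched as a small path check. This is a minimal illustration, not the package's `workflow-guard` code: the names `GuardConfig`, `normalizePath`, `asDirPrefix`, and `isWriteAllowed` are hypothetical, and the behavior for an empty path list in allow mode (everything blocked) is an assumption the skill file does not state.

```typescript
// Hypothetical sketch of the guard check; names and edge-case behavior are assumptions.
type GuardMode = "allow" | "deny";

interface GuardConfig {
  enabled: boolean;
  mode: GuardMode;
  paths: string[];
}

// Normalize to forward slashes and drop a leading "./".
function normalizePath(p: string): string {
  return p.replace(/\\/g, "/").replace(/^\.\//, "");
}

// Treat every configured entry as a directory prefix with a trailing slash.
function asDirPrefix(p: string): string {
  const n = normalizePath(p);
  return n.endsWith("/") ? n : n + "/";
}

// Returns true when a Write/Edit targeting `target` should be permitted.
function isWriteAllowed(config: GuardConfig, target: string): boolean {
  if (!config.enabled) return true; // guard off: nothing is blocked
  const file = normalizePath(target);
  const matched = config.paths.some((p) => file.startsWith(asDirPrefix(p)));
  // allow mode: only listed directories are writable;
  // deny mode: everything except listed directories is writable.
  return config.mode === "allow" ? matched : !matched;
}

const example: GuardConfig = { enabled: true, mode: "allow", paths: ["src/", "tests/", ".workflow/"] };
console.log(isWriteAllowed(example, "src/index.ts"));
console.log(isWriteAllowed(example, "package.json"));
```

Appending the trailing slash before the prefix comparison keeps `src/` from matching a sibling directory such as `srcfoo/`.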

package/.codex/skills/maestro-impeccable/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: maestro-impeccable
-description: Production-grade UI design — 24 commands + chain orchestration with quality gates + design search
+description: Use when designing, auditing, polishing, or improving frontend UI — websites, dashboards, landing pages, components
 argument-hint: "<command|chain|intent> [target] [flags]"
 allowed-tools: Read, Write, Edit, Bash, Glob, Grep, request_user_input
 ---
@@ -70,14 +70,14 @@ responsive-design.md, spatial-design.md, typography.md, ux-writing.md
 | Chain | Steps | Scenario |
 |-------|-------|----------|
-| build | teach? → explore? → shape → craft → critique → [refine] → audit → polish | 从零新建 |
-| redesign | document → explore → shape → craft → critique → [refine] → audit → polish | 基于现有代码重设计 |
-| improve | critique → [refine] → polish → audit | 迭代改进 |
-| enhance | {cmd...} → critique → [refine] → polish | 定向增强（支持多命令） |
-| launch | harden → adapt → optimize → audit → polish | 全方位上线准备 |
-| harden | harden → audit → polish | 边界加固 |
-| foundation | teach? → explore → document → extract | 纯设计系统建设 |
-| live | live | 实时迭代 |
+| build | teach? → explore? → shape → craft → critique → [refine] → audit → polish | New from scratch |
+| redesign | document → explore → shape → craft → critique → [refine] → audit → polish | Redesign existing code |
+| improve | critique → [refine] → polish → audit | Iterative improvement |
+| enhance | {cmd...} → critique → [refine] → polish | Targeted enhancement (multi-command) |
+| launch | harden → adapt → optimize → audit → polish | Full production readiness |
+| harden | harden → audit → polish | Edge case hardening |
+| foundation | teach? → explore → document → extract | Design system setup |
+| live | live | Real-time iteration |
 - `?` = conditional: teach if PRODUCT.md missing; explore if DESIGN.md missing and --skip-design not set
 - `[refine]` = quality gate loop: gate fails → auto-select fix commands from findings → re-gate
@@ -85,64 +85,67 @@ responsive-design.md, spatial-design.md, typography.md, ux-writing.md
 ## Free Text Routing
-Three-layer priority matching. Stop on first match.
+Three-layer priority matching. Stop on first match — do not continue to lower layers.
-### Layer 1: Intent matches single command → Direct
+### Layer 1: Single command intent → Direct
-Match user description against Command Routing descriptions. Route to the closest single command.
+Semantically match user description against the Command Routing table's Description column. Match the closest **single** command.
+**Skip condition**: If the prompt also contains a Layer 2 chain keyword AND does not focus on a single design dimension, skip this layer.
+Example: `enhance colors and typography` — "enhance" is a chain keyword + multiple design dimensions → skip to Layer 2.
 | Intent signal | Command |
 |---------------|---------|
-| review, UX check, heuristic, 评审, 评分 | critique |
-| a11y, audit, accessibility, performance audit, 技术检查 | audit |
-| animation, motion, transitions, 动效, 加动画 | animate |
-| color, palette, contrast, OKLCH, 配色, 颜色 | colorize |
-| typography, font, type scale, 字体, 排版 | typeset |
-| layout, spacing, grid, alignment, 布局, 间距 | layout |
-| tone down, too loud, 太花, 视觉噪音 | quieter |
-| too bland, bolder, more personality, 太平淡 | bolder |
-| simplify, strip, too complex, cognitive load, 太复杂 | distill |
-| polish, micro-adjust, pixel perfect, 打磨 | polish |
-| copy, labels, error messages, UX writing, 文案 | clarify |
-| responsive, mobile, breakpoints, 适配 | adapt |
-| performance, loading, bundle, jank, 性能 | optimize |
-| edge cases, error states, i18n, overflow, 边界 | harden |
-| onboarding, first-run, empty state, 引导 | onboard |
-| delight, personality, joy, memorable, 趣味 | delight |
-| extraordinary, push limits, 炫酷, 极限 | overdrive |
-| plan UX, wireframe, information architecture, 规划 | shape |
-| variants, compare styles, multi-style, 多风格 | explore |
-| PRODUCT.md, brand definition, 品牌定义 | teach |
-| DESIGN.md, design documentation, 设计文档 | document |
-| pull tokens, extract components, 提取组件 | extract |
-| browser iteration, 实时迭代 | live |
-### Layer 2: Concrete build task → Direct craft
-Layer 1 missed, but intent is "build/create specific thing":
-- Has specific file path or target
-- Has detailed visual specs (layout, style, palette)
-- Has reference material
-→ Route to **craft** (Direct)
-### Layer 3: Project intent → Chain
-Layer 1+2 missed, broad project direction:
+| review, check UX, score, heuristic, evaluate usability | critique |
+| audit, a11y, accessibility, technical check, performance audit, code quality | audit |
+| add animation, motion, transitions, micro-interactions | animate |
+| color, palette, OKLCH, contrast, color scheme | colorize |
+| font, typography, type scale, line height, font pairing | typeset |
+| layout, spacing, grid, alignment, visual hierarchy | layout |
+| too loud, tone down, visual noise, make it simpler, too busy | quieter |
+| too bland, bolder, more personality, stronger, more contrast | bolder |
+| too complex, simplify, strip, remove clutter, cognitive load | distill |
+| polish, fine-tune, pixel perfect, final pass, refine details | polish |
+| copy, labels, error messages, UX writing, microcopy, CTAs | clarify |
+| responsive, mobile, adapt, breakpoints, touch targets | adapt |
+| performance, loading, bundle, jank, speed, rendering | optimize |
+| edge cases, error states, i18n, overflow, empty state hardening | harden |
+| onboarding, first-run, empty state, activation, progressive disclosure | onboard |
+| fun, surprise, personality, memorable, joy, delight | delight |
+| extraordinary, push limits, ambitious effects, cutting-edge | overdrive |
+| plan UX, wireframe, information architecture, visual direction | shape |
+| multi-style, variants, compare styles, style comparison | explore |
+| brand definition, PRODUCT.md, product context | teach |
+| extract design, DESIGN.md, document design system | document |
+| pull tokens, extract components, design system extraction | extract |
+| real-time, browser iteration, live editing | live |
+### Layer 2: Project intent → Chain
+Layer 1 did not match. Check for chain-level keywords — even if the prompt also contains a specific target/path, chain matching takes priority.
 | Pattern | Chain |
 |---------|-------|
-| create, build, new, from scratch | build |
-| redesign, rethink, restyle | redesign |
-| improve, iterate, better | improve |
-| enhance, visual upgrade | enhance |
-| launch, deploy, ship, production-ready | launch |
-| harden, production, edge cases | harden |
-| design system, tokens, design spec | foundation |
-| live, browser | live |
+| new, create, build, from scratch, start fresh | build |
+| redo, redesign, rethink, restyle, overhaul, revamp | redesign |
+| improve, iterate, better, refine overall | improve |
+| enhance, visual upgrade, level up | enhance |
+| launch, deploy, ship, production-ready, go live | launch |
+| harden, production-harden, edge cases | harden |
+| design system, tokens, design foundation, design infrastructure | foundation |
+| real-time, live, browser | live |
 Ambiguous + no `-y` → `request_user_input`.
+### Layer 3: Concrete build task → Direct craft
+Layer 1+2 both did not match, but intent is to build/create a specific thing:
+- Contains a specific file path or target (`d:\path`, `src/pages/`, `index.html`)
+- Contains detailed visual specs (layout, style, color scheme)
+- Contains reference material (`based on...`, `like...`, `similar to...`)
+→ Route to **craft** (Direct)
 <invariants>
 1. Prerequisites before any design work — never skip context loading or register detection
 2. Read workflow file before execution — never execute a command without loading its .md
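
The reordered three-layer routing in the hunk above (single command, then chain, then craft) can be sketched as follows. This is a simplified illustration under stated assumptions: plain substring matching stands in for the semantic matching the skill describes, the two tables are short excerpts, the skip condition is approximated as "a chain keyword is present and the prompt does not hit exactly one command", and `Route`, `matches`, `routeFreeText`, and the final `ask` fallback are hypothetical.

```typescript
// Hypothetical sketch of the free-text routing order; not the skill's actual matcher.
type Route =
  | { kind: "direct"; command: string }
  | { kind: "chain"; chain: string }
  | { kind: "ask" };

// Short excerpts of the Layer 1 and Layer 2 tables.
const COMMAND_SIGNALS: Record<string, string[]> = {
  critique: ["review", "heuristic"],
  colorize: ["color", "palette"],
  typeset: ["font", "typography"],
  animate: ["animation", "motion"],
};

const CHAIN_SIGNALS: Record<string, string[]> = {
  build: ["from scratch", "start fresh"],
  redesign: ["redesign", "overhaul"],
  enhance: ["enhance", "visual upgrade"],
};

// Names of all table entries with at least one signal present in the prompt.
function matches(prompt: string, table: Record<string, string[]>): string[] {
  const text = prompt.toLowerCase();
  return Object.keys(table).filter((name) =>
    table[name].some((signal) => text.includes(signal))
  );
}

function routeFreeText(prompt: string): Route {
  const commands = matches(prompt, COMMAND_SIGNALS);
  const chains = matches(prompt, CHAIN_SIGNALS);

  // Layer 1: single command, unless a chain keyword is present and
  // the prompt does not focus on exactly one design dimension.
  const skipLayer1 = chains.length > 0 && commands.length !== 1;
  if (!skipLayer1 && commands.length > 0) {
    return { kind: "direct", command: commands[0] };
  }

  // Layer 2: chain keywords win even when a target path is also present.
  if (chains.length > 0) return { kind: "chain", chain: chains[0] };

  // Layer 3: a concrete target or reference material routes to craft.
  const hasTarget = /[\w.-]+\/|\b[\w-]+\.\w{2,5}\b/.test(prompt);
  const hasReference = /\b(based on|similar to)\b/i.test(prompt);
  if (hasTarget || hasReference) return { kind: "direct", command: "craft" };

  return { kind: "ask" }; // assumed fallback: ask the user
}

console.log(routeFreeText("enhance colors and typography"));
```

With these excerpt tables, the skill's own example `enhance colors and typography` hits two commands plus the `enhance` chain keyword, so Layer 1 is skipped and the chain is selected.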
@@ -161,23 +164,23 @@ Before reading any command workflow:
 ## Direct Execution
 1. Prerequisites ✓
-2. **显示执行信息**：
+2. **Display execution info**:
    ```
    ── Command: {command} ────────────────────
    Category: {category} | Target: {target}
    ─────────────────────────────────────────
    ```
 3. Read `~/.maestro/workflows/impeccable/{command}.md`
-4. **TodoWrite 跟踪**：按 workflow 文件中的主要阶段创建 todo 项
-   - 格式：`[{command}] {phase description}`
-   - 每个阶段完成后立即标记 completed
+4. **Progress tracking**: create todo items for each major phase in the workflow file
+   - Format: `[{command}] {phase description}`
+   - Mark each phase completed immediately upon finishing
 5. Follow workflow file instructions
 6. Post: suggest logical next command (teach→shape, shape→craft, craft→critique, etc.)
 ## Chain Execution
 1. Prerequisites ✓
-2. **显示执行链**：解析 chain 定义，输出完整步骤预览：
+2. **Display chain preview**: parse chain definition, output full step preview:
    ```
    ── Chain: build ──────────────────────────
     1. teach        (conditional: PRODUCT.md missing)
@@ -191,31 +194,31 @@ Before reading any command workflow:
    ─────────────────────────────────────────
    Target: {target}
    ```
-   - `◆` 标记 quality gate 步骤，显示阈值
-   - `↺` 标记 refine loop，显示最大循环次数
-   - conditional 步骤注明触发条件
-   - 跳过的 conditional 步骤标记 `(skipped)`
+   - `◆` marks quality gate steps with threshold
+   - `↺` marks refine loop with max iteration count
+   - Conditional steps show trigger condition
+   - Skipped conditional steps marked `(skipped)`
 3. Create session: `.workflow/.maestro/ui-craft-{YYYYMMDD-HHmmss}/status.json`
    ```json
    { "chain_type": "...", "target": "...", "steps": [...], "current_step": 0,
      "gate_history": [], "loop_count": 0, "status": "running" }
    ```
-4. **TodoWrite 初始化**：为 chain 所有步骤创建 todo 项
-   - 每步一项，格式：`[chain] step N: {command} — {description}`
-   - conditional 步骤若跳过，立即标记 completed
-   - quality gate 步骤标注阈值：`[chain] step 5: critique ◆ gate ≥26/40`
+4. **Init tracking**: create todo items for all chain steps
+   - One item per step, format: `[chain] step N: {command} — {description}`
+   - If conditional step is skipped, immediately mark completed
+   - Quality gate steps include threshold: `[chain] step 5: critique ◆ gate ≥26/40`
 5. For each step:
    - Read `~/.maestro/workflows/impeccable/{command}.md` → execute
-   - **步骤开始**：TodoWrite 标记当前步骤 in_progress
-   - **步骤完成**：TodoWrite 标记 completed + update status.json (`current_step`, step `status`)
-   - **步骤失败**：TodoWrite 标记 completed(with note) + 记录原因
+   - **Step start**: mark current step in_progress
+   - **Step done**: mark completed + update status.json (`current_step`, step `status`)
+   - **Step failed**: mark completed (with note) + record reason
 6. **Quality gate** (critique/audit steps):
    - Parse score: critique `**Total** | | **N/40**`, audit `**Total** | | **N/20**`
    - Count `[P0]` / `[P1]` tags
    - Pass: score ≥ threshold AND P0 == 0 → advance
    - Fail: collect suggested commands from findings → execute → re-gate
    - Max loops exceeded → force advance with warning
-   - TodoWrite：gate 结果记入当前步骤备注（score, P0/P1 count, pass/fail）
+   - Record gate result in current step notes (score, P0/P1 count, pass/fail)
 7. Final report: scores + trend + commands executed
 ## Resume
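
The quality gate rule in the Chain Execution steps (score at or above the threshold AND zero `[P0]` tags) can be sketched as a small parser. This is a minimal sketch, assuming the report contains the `**Total** | | **N/40**` (or `N/20`) row in exactly that shape; `GateResult` and `evaluateGate` are hypothetical names, and treating a missing total row as a failed gate is an assumption.

```typescript
// Hypothetical sketch of the gate check; the report format is taken from the skill text.
interface GateResult {
  score: number | null;
  max: number | null;
  p0: number;
  p1: number;
  pass: boolean;
}

function evaluateGate(report: string, threshold: number): GateResult {
  // Matches rows such as "**Total** | | **28/40**".
  const total = /\*\*Total\*\*\s*\|\s*\|\s*\*\*(\d+)\/(\d+)\*\*/.exec(report);
  const score = total ? Number(total[1]) : null;
  const max = total ? Number(total[2]) : null;

  // Count severity tags anywhere in the findings.
  const p0 = (report.match(/\[P0\]/g) || []).length;
  const p1 = (report.match(/\[P1\]/g) || []).length;

  // Pass: score >= threshold AND P0 == 0. A missing score fails (assumption).
  const pass = score !== null && score >= threshold && p0 === 0;
  return { score, max, p0, p1, pass };
}

const sample = "| **Total** | | **28/40** |\n- [P1] Low contrast on secondary buttons";
console.log(evaluateGate(sample, 26));
```

The returned counts and score are the same fields the step notes record (score, P0/P1 count, pass/fail), so the result object could be stored in `gate_history` as-is.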

package/.codex/skills/maestro-plan/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: maestro-plan
-description: Plan phase execution with exploration and verification
+description: Use when creating, revising, or verifying an execution plan for a phase or task
 argument-hint: "[-y|--yes] [-c|--concurrency N] [--continue] \"<phase> [--dir <path>] [--gaps] [--spec SPEC-xxx] [--collab]\""
 allowed-tools: spawn_agents_on_csv, Read, Write, Edit, Bash, Glob, Grep, request_user_input
 ---
@@ -8,13 +8,64 @@ allowed-tools: spawn_agents_on_csv, Read, Write, Edit, Bash, Glob, Grep, request
 <purpose>
 Wave-based planning via `spawn_agents_on_csv`. Wave 1 explores codebase in parallel across multiple angles, Wave 2 generates verified execution plan consuming all exploration findings.
-Supports: Create (default), Revise (`--revise`), Check (`--check`), Gaps (`--gaps`).
+Supports: Create (default), Revise (`--revise`), Check (`--check`), Gaps (`--gaps`), TDD (`--tdd`).
 </purpose>
+<tdd_mode>
+## TDD Mode (`--tdd`)
+When `--tdd` is active, the planning agent in Wave 2 decomposes each behavior into RED-GREEN-REFACTOR triplets.
+### Iron Law
+**NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST.** Write code before the test? Delete it. Start over.
+### Task Chain Structure
+For each behavior B:
+- **TASK-{N}a (RED)**: Write failing test. Verify it FAILS (not errors). type=test, tdd_phase=red.
+- **TASK-{N}b (GREEN)**: Write minimal code to pass. Verify ALL tests pass. type=feature, tdd_phase=green, depends_on=[TASK-{N}a].
+- **TASK-{N}c (REFACTOR)**: Clean up. Keep tests green. No new behavior. type=refactor, tdd_phase=refactor, depends_on=[TASK-{N}b]. Skip if GREEN code already clean.
+### Wave Assignment
+```
+Wave 1: TASK-1a, TASK-2a (RED — parallel if independent)
+Wave 2: TASK-1b, TASK-2b (GREEN — parallel)
+Wave 3: TASK-1c, TASK-2c (REFACTOR — parallel)
+```
+Within a group: `{N}a → {N}b → {N}c` (strict dependency).
+### plan.json Output
+```json
+{ "tdd_mode": true, "tdd_groups": [{ "group": 1, "behavior": "...", "tasks": ["TASK-1a","TASK-1b","TASK-1c"] }] }
+```
+Standard plan.json + .task/TASK-*.json — consumable by maestro-execute without modification.
+### Execution Enforcement
+- RED task: verify test exists AND fails. If passes → BLOCKED "wrong test".
+- GREEN task: verify ALL tests pass. If RED test still fails → BLOCKED.
+- REFACTOR task: verify ALL tests still pass. If fails → undo.
+### Red Flags — These Thoughts Mean STOP
+- "Too simple to need TDD" / "I'll write tests after" / "Let me explore first, then add tests"
+- "Tests after achieve the same goals" / "TDD will slow me down"
+All mean: **follow the cycle anyway**.
+### Rationalization Table
+| Excuse | Reality |
+|--------|---------|
+| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
+| "I'll test after" | Tests passing immediately prove nothing. |
+| "Need to explore first" | Fine. Throw away exploration, start fresh with TDD. |
+| "Test hard = design unclear" | Listen to the test. Hard to test = hard to use. |
+</tdd_mode>
 <context>
 $ARGUMENTS — phase number/text and optional flags.
-**Flags**: `-y` (auto), `-c N` (concurrency, default 4), `--continue` (resume), `--dir <path>`, `--gaps` (issue-linked), `--spec SPEC-xxx`, `--collab`, `--revise`, `--check`
+**Flags**: `-y` (auto), `-c N` (concurrency, default 4), `--continue` (resume), `--dir <path>`, `--gaps` (issue-linked), `--spec SPEC-xxx`, `--collab`, `--revise`, `--check`, `--tdd` (RED-GREEN-REFACTOR task chains)
 **Scope routing** (priority): --dir → from parent artifact; no args → milestone; digit → phase; text → adhoc/standalone.
@@ -68,7 +119,7 @@ S_RESUME → S_CHECK     WHEN: W2 done, check pending
 S_CONTEXT → S_CSV_GEN  DO: load context.md, conclusions.json, specs, wiki, codebase docs
-S_CSV_GEN → S_WAVE_1   DO: determine exploration angles, generate tasks.csv, user validates (skip -y)
+S_CSV_GEN → S_WAVE_1   DO: pre-flight (`maestro collab preflight --phase N`; exit 1 → warn + ask), determine exploration angles, generate tasks.csv, user validates (skip -y)
 S_WAVE_1 → S_WAVE_2    DO: spawn parallel explorations, merge results, build prev_context
@@ -130,7 +181,15 @@ Collision detection against same-milestone plans.
 <success_criteria>
 - [ ] Parallel explorations + sequential planning via spawn_agents_on_csv
-- [ ] plan.json + TASK files with read_first and grep-verifiable convergence criteria
-- [ ] Plan confidence scored, readiness gate checked, collision detected
-- [ ] PLN artifact registered, issues linked if --gaps
+- [ ] plan.json with summary, approach, task_ids, waves (with phase labels), confidence section
+- [ ] .task/TASK-*.json with read_first[] (file being modified + source of truth files)
+- [ ] Every task has convergence.criteria[] with grep-verifiable conditions (no subjective language)
+- [ ] Every task action and implementation contain concrete values (no "align X with Y")
+- [ ] Plan confidence scored with 5-dimension factor model
+- [ ] Readiness gate checked before collision detection
+- [ ] Pressure pass completed on highest-complexity task
+- [ ] Collision detection against same-milestone plans (non-blocking)
+- [ ] Plan-checker passed (or minor issues acknowledged, max 3 iterations)
+- [ ] PLN artifact registered in state.json
+- [ ] If --gaps: issues linked bidirectionally (task_refs[], task_plan_dir in issues.jsonl)
 </success_criteria>
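
The RED-GREEN-REFACTOR decomposition added in `<tdd_mode>` can be sketched as a generator that turns a list of behaviors into task triplets and waves. This is an illustration only: `TddTask`, `TddGroup`, `buildTddPlan`, and the per-task `wave` field are hypothetical, behaviors are assumed to be independent of each other, and the "skip REFACTOR if GREEN code is already clean" rule is not modeled.

```typescript
// Hypothetical sketch of TDD task-chain generation; field names beyond
// tdd_mode, tdd_groups, type, tdd_phase, and depends_on are assumptions.
type TddPhase = "red" | "green" | "refactor";

interface TddTask {
  id: string;
  type: "test" | "feature" | "refactor";
  tdd_phase: TddPhase;
  depends_on: string[];
  wave: number; // 1 = RED, 2 = GREEN, 3 = REFACTOR
}

interface TddGroup {
  group: number;
  behavior: string;
  tasks: string[];
}

function buildTddPlan(behaviors: string[]) {
  const tasks: TddTask[] = [];
  const tdd_groups: TddGroup[] = [];

  behaviors.forEach((behavior, index) => {
    const n = index + 1;
    const red = `TASK-${n}a`;
    const green = `TASK-${n}b`;
    const refactor = `TASK-${n}c`;

    // Strict dependency inside a group: a -> b -> c.
    tasks.push(
      { id: red, type: "test", tdd_phase: "red", depends_on: [], wave: 1 },
      { id: green, type: "feature", tdd_phase: "green", depends_on: [red], wave: 2 },
      { id: refactor, type: "refactor", tdd_phase: "refactor", depends_on: [green], wave: 3 }
    );
    tdd_groups.push({ group: n, behavior, tasks: [red, green, refactor] });
  });

  return { tdd_mode: true, tdd_groups, tasks };
}

const plan = buildTddPlan(["rejects empty password", "locks account after 5 failures"]);
console.log(plan.tasks.filter((t) => t.wave === 1).map((t) => t.id));
```

Grouping by phase rather than by behavior is what lets independent RED tasks run in parallel in Wave 1 while each group keeps its own a-b-c ordering.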

package/.codex/skills/maestro-ralph/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: maestro-ralph
-description: Adaptive lifecycle engine -- infer state, build command chain
+description: Use when the optimal command sequence is unclear and needs automated state-based determination
 argument-hint: "\"intent\" [-y] | status | continue | execute"
 allowed-tools: spawn_agents_on_csv, Read, Write, Edit, Bash, Glob, Grep, request_user_input
 ---

package/.codex/skills/maestro-verify/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: maestro-verify
-description: Verify goals with must-have checks and test coverage validation
+description: Use after execution to verify goals are actually achieved with evidence-based structural checks
 argument-hint: "[-y|--yes] [-c|--concurrency N] [--continue] \"<phase> [--skip-tests] [--skip-antipattern]\""
 allowed-tools: spawn_agents_on_csv, Read, Write, Edit, Bash, Glob, Grep, request_user_input
 ---
@@ -10,6 +10,18 @@ Wave-based 3-layer Goal-Backward verification using `spawn_agents_on_csv`.
 Wave 1 (truth + artifact existence) -> Wave 2 (substance + wiring) -> Wave 3 (anti-pattern + Nyquist audit).
 **Core principle**: Task completion != Goal achievement. A task marked complete may contain stubs/placeholders. This verifier checks that goals are actually achieved.
+## Iron Law
+**NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE IN THIS MESSAGE.** Before any success claim: IDENTIFY what command proves it → RUN it fresh → READ full output → VERIFY it confirms the claim → ONLY THEN make the claim.
+## Forbidden Wording
+BANNED: "Should work now", "Probably passes", "Seems correct", "Looks good", "I'm confident that...", any satisfaction BEFORE running verification. Replace with evidence: `"Tests pass: 42/42 green (exit 0)"`.
+## Red Flags — These Thoughts Mean STOP
+- "I just wrote this code, it definitely works" / "The changes are too small to break anything"
+- "I already verified this earlier" / "The agent said it's done"
+All mean: **run verification command NOW, read output, then report**.
 </purpose>
 <context>
@@ -182,12 +194,17 @@ Protocol: read before analysis, append-only, dedup by type+key.
 </error_codes>
 <success_criteria>
+- [ ] Must-haves established from convergence.criteria + success_criteria + derived behaviors
 - [ ] All 3 waves executed (with skip flags respected)
 - [ ] verification.json + context.md produced
 - [ ] validation.json produced (if Nyquist ran)
 - [ ] Fix plans generated for gap clusters
 - [ ] Issues auto-created for gaps + blocker anti-patterns
+- [ ] Post-verify knowledge inquiry triggered when applicable
 - [ ] Phase index.json updated with verification status
+- [ ] VRF artifact registered in state.json
+- [ ] Gap-fix closure loop documented: gaps → plan --gaps → execute → verify (re-run)
+- [ ] Next step routed (quality-review if passed, plan --gaps if gaps, quality-auto-test if low coverage)
 - [ ] discoveries.ndjson append-only throughout
 </success_criteria>
 </output>

package/.codex/skills/quality-auto-test/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: quality-auto-test
-description: Auto-generate and run tests from specs or coverage gaps
+description: Use when test coverage needs automated expansion or existing tests need iterative convergence
 argument-hint: "<phase> [-y] [-c N] [--max-iter N] [--layer L0-L3] [--dry-run] [--re-run]"
 allowed-tools: spawn_agents_on_csv, Read, Write, Edit, Bash, Glob, Grep, AskUserQuestion
 ---
@@ -94,11 +94,15 @@ S_PARSE:
   -> S_SOURCE       DO: resolve phase dir, detect route (resume/re-run/spec/gap/code)
 S_SOURCE:
-  -> S_INFRA        DO: extract scenarios per route, normalize to unified format
+  -> S_INFRA        DO: extract scenarios per route, normalize to unified format, integrate quality artifacts
   Route A (spec): Parse REQ-*.md acceptance criteria, classify layers, generate fixtures
   Route B (gap): Read verification/coverage gaps, classify files by type
   Route C (code): Explore module boundaries, API endpoints, integration points
+  **Cross-artifact integration** (all routes, after primary extraction):
+  - **Review findings**: Query state.json for type=review artifacts on same phase. Extract critical/high findings → additional scenarios marked `source: "review_finding"`. If review verdict=="BLOCK" and these tests fail, suggest quality-debug.
+  - **Debug root causes**: Query state.json for type=debug artifacts on same phase. Generate regression test scenarios from confirmed root causes → marked `source: "debug_root_cause"`.
 S_INFRA:
   -> S_CSV_GEN      DO: detect framework, read 2-3 existing tests, build infrastructure_hints
@@ -244,12 +248,18 @@ Protocol: read before writing tests, append-only, dedup by type+key.
 <success_criteria>
 - [ ] Route auto-selected from project state (spec/gap/code)
+- [ ] Review findings and debug root causes integrated as additional test scenarios
 - [ ] Layers executed in order with fail-fast on critical
 - [ ] Test writing + diagnosis parallelized via spawn_agents_on_csv
 - [ ] Cross-layer context propagation via prev_context
 - [ ] Iteration engine: inner test_defect fix, outer strategy adjust
 - [ ] Test confidence scored per iteration (5-dimension model)
+- [ ] Convergence check includes confidence >= 60% alongside pass_rate threshold
+- [ ] Pressure pass completed on highest-pass-rate layer before completion
 - [ ] state.json, report.json, reflection-log.md written
-- [ ] If spec: traceability.md; if failures: issues auto-created
+- [ ] TST artifact registered in state.json
+- [ ] If spec: traceability.md written; if failures: issues auto-created in issues.jsonl
+- [ ] If gap source: validation.json gaps updated (MISSING→COVERED)
+- [ ] Next step routed (converged → verify, bugs → debug, >80% → quality-test, <80% → debug)
 </success_criteria>
 </output>
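
The cross-artifact integration step added to `S_SOURCE` can be sketched as a filter over artifacts registered for the same phase. This is a rough sketch: the diff does not show the real `state.json` schema, so the `Artifact`, `Finding`, and `RootCause` shapes, and the `extraScenarios` function, are hypothetical. Only the `source` labels and the critical/high and confirmed-root-cause filters come from the skill text.

```typescript
// Hypothetical artifact shapes; the actual state.json layout is not shown in the diff.
interface Finding {
  severity: "critical" | "high" | "medium" | "low";
  title: string;
}

interface RootCause {
  summary: string;
  confirmed: boolean;
}

interface Artifact {
  type: string; // e.g. "review" or "debug"
  phase: number;
  findings?: Finding[];
  root_causes?: RootCause[];
}

interface Scenario {
  description: string;
  source: "review_finding" | "debug_root_cause";
}

function extraScenarios(artifacts: Artifact[], phase: number): Scenario[] {
  const out: Scenario[] = [];
  for (const artifact of artifacts) {
    if (artifact.phase !== phase) continue; // same phase only

    if (artifact.type === "review") {
      // Only critical/high review findings become extra scenarios.
      for (const f of artifact.findings || []) {
        if (f.severity === "critical" || f.severity === "high") {
          out.push({ description: f.title, source: "review_finding" });
        }
      }
    } else if (artifact.type === "debug") {
      // Confirmed root causes become regression scenarios.
      for (const rc of artifact.root_causes || []) {
        if (rc.confirmed) {
          out.push({ description: rc.summary, source: "debug_root_cause" });
        }
      }
    }
  }
  return out;
}

const scenarios = extraScenarios(
  [
    { type: "review", phase: 2, findings: [{ severity: "high", title: "Unvalidated redirect" }] },
    { type: "debug", phase: 2, root_causes: [{ summary: "Stale cache key", confirmed: true }] },
  ],
  2
);
console.log(scenarios.map((s) => s.source));
```

Tagging each scenario with its `source` keeps the extra tests traceable back to the review or debug artifact that motivated them.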