npm - maestro-flow - Versions diffs - 0.4.2 → 0.4.3 - Mend

maestro-flow 0.4.2 → 0.4.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (80)

package/.claude/commands/maestro-analyze.md +1 -1
package/.claude/commands/maestro-brainstorm.md +1 -1
package/.claude/commands/maestro-collab.md +1 -1
package/.claude/commands/maestro-execute.md +10 -1
package/.claude/commands/maestro-guard.md +101 -0
package/.claude/commands/maestro-impeccable.md +1 -1
package/.claude/commands/maestro-plan.md +15 -2
package/.claude/commands/maestro-ralph-execute.md +9 -2
package/.claude/commands/maestro-ralph.md +8 -1
package/.claude/commands/maestro-verify.md +15 -1
package/.claude/commands/quality-auto-test.md +1 -1
package/.claude/commands/quality-debug.md +1 -1
package/.claude/commands/quality-refactor.md +1 -1
package/.claude/commands/quality-retrospective.md +1 -1
package/.claude/commands/quality-review.md +15 -1
package/.claude/commands/quality-test.md +1 -1
package/.claude/commands/security-audit.md +154 -0
package/.claude/skills/maestro-help/index/catalog.json +2 -0
package/.codex/skills/maestro-analyze/SKILL.md +18 -1
package/.codex/skills/maestro-brainstorm/SKILL.md +17 -4
package/.codex/skills/maestro-collab/SKILL.md +7 -1
package/.codex/skills/maestro-execute/SKILL.md +365 -348
package/.codex/skills/maestro-guard/SKILL.md +97 -0
package/.codex/skills/maestro-impeccable/SKILL.md +1 -1
package/.codex/skills/maestro-plan/SKILL.md +66 -7
package/.codex/skills/maestro-ralph/SKILL.md +1 -1
package/.codex/skills/maestro-verify/SKILL.md +18 -1
package/.codex/skills/quality-auto-test/SKILL.md +13 -3
package/.codex/skills/quality-debug/SKILL.md +362 -346
package/.codex/skills/quality-refactor/SKILL.md +1 -1
package/.codex/skills/quality-retrospective/SKILL.md +292 -292
package/.codex/skills/quality-review/SKILL.md +374 -365
package/.codex/skills/quality-test/SKILL.md +1 -1
package/.codex/skills/security-audit/SKILL.md +154 -0
package/bin/maestro-hook-runner.js +21 -1
package/dashboard/dist-server/src/coordinator/output-parser.js +27 -0
package/dashboard/dist-server/src/coordinator/output-parser.js.map +1 -1
package/dist/src/commands/coordinate.d.ts.map +1 -1
package/dist/src/commands/coordinate.js +2 -0
package/dist/src/commands/coordinate.js.map +1 -1
package/dist/src/commands/hooks.d.ts.map +1 -1
package/dist/src/commands/hooks.js +39 -3
package/dist/src/commands/hooks.js.map +1 -1
package/dist/src/coordinator/output-parser.d.ts.map +1 -1
package/dist/src/coordinator/output-parser.js +27 -0
package/dist/src/coordinator/output-parser.js.map +1 -1
package/dist/src/hooks/delegate-monitor.d.ts +1 -0
package/dist/src/hooks/delegate-monitor.d.ts.map +1 -1
package/dist/src/hooks/delegate-monitor.js +1 -1
package/dist/src/hooks/delegate-monitor.js.map +1 -1
package/dist/src/hooks/guards/workflow-guard.d.ts +15 -0
package/dist/src/hooks/guards/workflow-guard.d.ts.map +1 -1
package/dist/src/hooks/guards/workflow-guard.js +61 -1
package/dist/src/hooks/guards/workflow-guard.js.map +1 -1
package/dist/src/hooks/plugins/decision-log-plugin.d.ts +19 -0
package/dist/src/hooks/plugins/decision-log-plugin.d.ts.map +1 -0
package/dist/src/hooks/plugins/decision-log-plugin.js +28 -0
package/dist/src/hooks/plugins/decision-log-plugin.js.map +1 -0
package/dist/src/hooks/plugins/index.d.ts +2 -0
package/dist/src/hooks/plugins/index.d.ts.map +1 -1
package/dist/src/hooks/plugins/index.js +1 -0
package/dist/src/hooks/plugins/index.js.map +1 -1
package/dist/src/hooks/session-context.d.ts +1 -0
package/dist/src/hooks/session-context.d.ts.map +1 -1
package/dist/src/hooks/session-context.js +1 -1
package/dist/src/hooks/session-context.js.map +1 -1
package/dist/src/hooks/skill-context.d.ts +1 -0
package/dist/src/hooks/skill-context.d.ts.map +1 -1
package/dist/src/hooks/skill-context.js +1 -1
package/dist/src/hooks/skill-context.js.map +1 -1
package/dist/src/hooks/spec-injector.d.ts.map +1 -1
package/dist/src/hooks/spec-injector.js +2 -0
package/dist/src/hooks/spec-injector.js.map +1 -1
package/package.json +1 -1
package/workflows/debug.md +73 -0
package/workflows/execute.md +27 -0
package/workflows/plan.md +11 -0
package/workflows/review.md +33 -1
package/workflows/tdd.md +257 -0
package/workflows/verify.md +57 -0

package/.codex/skills/maestro-guard/SKILL.md ADDED

@@ -0,0 +1,97 @@
+---
+name: maestro-guard
+description: Manage editing boundary restrictions
+argument-hint: "<on|off|status|allow <path>|deny <path>>"
+allowed-tools: Read, Write, Bash, Glob
+---
+<purpose>
+Configure directory-level write boundaries enforced by the workflow-guard PreToolUse hook.
+When enabled, Write and Edit tool calls targeting files outside allowed paths are blocked.
+Subcommands:
+- **on** -- Enable path guard (defaults to `src/` if no paths configured)
+- **off** -- Disable path guard (preserves path list)
+- **status** -- Show current guard configuration
+- **allow `<path>`** -- Add a directory to the allowed paths list
+- **deny `<path>`** -- Switch to deny mode and add path to deny list
+</purpose>
+<context>
+$ARGUMENTS -- Parse subcommand and optional path argument.
+**Config location:** `.workflow/config.json` -> `guard` section
+```json
+{
+  "guard": {
+    "enabled": false,
+    "mode": "allow",
+    "paths": []
+  }
+}
+```
+**Enforcement:** The `workflow-guard` hook (PreToolUse on Write/Edit) reads this config
+and blocks operations targeting files outside boundaries. Requires hooks level >= `full`.
+</context>
+<execution>
+**Step 1: Parse subcommand**
+Extract from $ARGUMENTS:
+- `on` / `off` / `status` / `allow <path>` / `deny <path>`
+- If no subcommand, default to `status`
+**Step 2: Read config**
+Read `.workflow/config.json`. If file missing, initialize with empty guard section.
+**Step 3: Execute subcommand**
+**`status`:**
+- Display: enabled/disabled, mode (allow/deny), paths list
+- Check if workflow-guard hook is active (read `.codex/settings.json` for hook presence)
+- If guard enabled but hook not active, warn: "WARNING: PathGuard enabled but workflow-guard hook not installed. Run `maestro hooks level full` to activate."
+**`on`:**
+- Set `guard.enabled = true`
+- If `guard.paths` is empty, set default: `["src/", "tests/", ".workflow/"]`
+- Check hook level, warn if < full
+- Write config
+**`off`:**
+- Set `guard.enabled = false`
+- Preserve existing paths and mode
+- Write config
+**`allow <path>`:**
+- Normalize path to forward slashes, ensure trailing slash for directories
+- If `guard.mode` is `deny`, switch to `allow` and clear paths with warning
+- Add path to `guard.paths` (deduplicate)
+- Set `guard.enabled = true` if not already
+- Write config
+**`deny <path>`:**
+- Normalize path to forward slashes
+- Set `guard.mode = "deny"`
+- Add path to `guard.paths` (deduplicate)
+- Set `guard.enabled = true` if not already
+- Write config
+**Step 4: Confirm**
+Display updated guard configuration.
+</execution>
+<error_codes>
+- E001: `.workflow/config.json` not found and cannot be created (not a maestro project)
+- W001: PathGuard enabled but workflow-guard hook not installed
+</error_codes>
+<success_criteria>
+- [ ] Config read/written correctly
+- [ ] Hook level warning displayed when applicable
+- [ ] Updated configuration shown after changes
+</success_criteria>
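
The subcommand rules in Step 3 of the added skill can be read as a pure function over the `guard` section. The following is a minimal sketch under stated assumptions, not the package's implementation: the type names, `applyGuardCommand`, `normalize`, and `isWriteAllowed` are hypothetical, and the prefix-matching rule in `isWriteAllowed` is an assumption about how the hook might compare paths. Only the `enabled`, `mode`, and `paths` fields and the Step 3 behavior come from the skill file. The sketch follows the Step 3 default (`src/`, `tests/`, `.workflow/`) rather than the single `src/` mentioned in the purpose text.

```typescript
// Hypothetical types; only the field names come from the skill's config example.
type GuardMode = "allow" | "deny";

interface GuardConfig {
  enabled: boolean;
  mode: GuardMode;
  paths: string[];
}

type GuardCommand =
  | { kind: "on" }
  | { kind: "off" }
  | { kind: "allow"; path: string }
  | { kind: "deny"; path: string };

// Default path list taken from Step 3 of the skill.
const DEFAULT_PATHS = ["src/", "tests/", ".workflow/"];

// Convert backslashes to forward slashes; optionally force a trailing slash.
function normalize(path: string, trailingSlash: boolean): string {
  const p = path.replace(/\\/g, "/");
  return trailingSlash && p.slice(-1) !== "/" ? p + "/" : p;
}

// Return a new config with the subcommand applied; the input is not mutated.
function applyGuardCommand(config: GuardConfig, cmd: GuardCommand): GuardConfig {
  switch (cmd.kind) {
    case "on":
      return {
        ...config,
        enabled: true,
        paths: config.paths.length > 0 ? config.paths : [...DEFAULT_PATHS],
      };
    case "off":
      // Paths and mode are preserved so the guard can be re-enabled later.
      return { ...config, enabled: false };
    case "allow": {
      // Switching from deny mode clears the existing list (the skill also warns here).
      const base = config.mode === "deny" ? [] : config.paths;
      const path = normalize(cmd.path, true);
      const paths = base.indexOf(path) !== -1 ? base : [...base, path];
      return { enabled: true, mode: "allow", paths };
    }
    case "deny": {
      const path = normalize(cmd.path, false);
      const paths = config.paths.indexOf(path) !== -1 ? config.paths : [...config.paths, path];
      return { enabled: true, mode: "deny", paths };
    }
  }
}

// Assumed enforcement rule: a file matches a listed path when it equals it
// or lies underneath it as a directory.
function isWriteAllowed(config: GuardConfig, file: string): boolean {
  if (!config.enabled) return true;
  const target = normalize(file, false);
  const hit = config.paths.some((p) => {
    const dir = normalize(p, true);
    return target === p || target.indexOf(dir) === 0;
  });
  return config.mode === "allow" ? hit : !hit;
}
```

Keeping the update logic free of file I/O means the read and write of `.workflow/config.json` can stay in a thin wrapper, which makes the rules easy to test in isolation.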

package/.codex/skills/maestro-impeccable/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: maestro-impeccable
-description: Production-grade UI design — 24 commands + chain orchestration with quality gates + design search
+description: Use when designing, auditing, polishing, or improving frontend UI — websites, dashboards, landing pages, components
 argument-hint: "<command|chain|intent> [target] [flags]"
 allowed-tools: Read, Write, Edit, Bash, Glob, Grep, request_user_input
 ---

package/.codex/skills/maestro-plan/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: maestro-plan
-description: Plan phase execution with exploration and verification
+description: Use when creating, revising, or verifying an execution plan for a phase or task
 argument-hint: "[-y|--yes] [-c|--concurrency N] [--continue] \"<phase> [--dir <path>] [--gaps] [--spec SPEC-xxx] [--collab]\""
 allowed-tools: spawn_agents_on_csv, Read, Write, Edit, Bash, Glob, Grep, request_user_input
 ---
@@ -8,13 +8,64 @@ allowed-tools: spawn_agents_on_csv, Read, Write, Edit, Bash, Glob, Grep, request
 <purpose>
 Wave-based planning via `spawn_agents_on_csv`. Wave 1 explores codebase in parallel across multiple angles, Wave 2 generates verified execution plan consuming all exploration findings.
-Supports: Create (default), Revise (`--revise`), Check (`--check`), Gaps (`--gaps`).
+Supports: Create (default), Revise (`--revise`), Check (`--check`), Gaps (`--gaps`), TDD (`--tdd`).
 </purpose>
+<tdd_mode>
+## TDD Mode (`--tdd`)
+When `--tdd` is active, the planning agent in Wave 2 decomposes each behavior into RED-GREEN-REFACTOR triplets.
+### Iron Law
+**NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST.** Write code before the test? Delete it. Start over.
+### Task Chain Structure
+For each behavior B:
+- **TASK-{N}a (RED)**: Write failing test. Verify it FAILS (not errors). type=test, tdd_phase=red.
+- **TASK-{N}b (GREEN)**: Write minimal code to pass. Verify ALL tests pass. type=feature, tdd_phase=green, depends_on=[TASK-{N}a].
+- **TASK-{N}c (REFACTOR)**: Clean up. Keep tests green. No new behavior. type=refactor, tdd_phase=refactor, depends_on=[TASK-{N}b]. Skip if GREEN code already clean.
+### Wave Assignment
+```
+Wave 1: TASK-1a, TASK-2a (RED — parallel if independent)
+Wave 2: TASK-1b, TASK-2b (GREEN — parallel)
+Wave 3: TASK-1c, TASK-2c (REFACTOR — parallel)
+```
+Within a group: `{N}a → {N}b → {N}c` (strict dependency).
+### plan.json Output
+```json
+{ "tdd_mode": true, "tdd_groups": [{ "group": 1, "behavior": "...", "tasks": ["TASK-1a","TASK-1b","TASK-1c"] }] }
+```
+Standard plan.json + .task/TASK-*.json — consumable by maestro-execute without modification.
+### Execution Enforcement
+- RED task: verify test exists AND fails. If passes → BLOCKED "wrong test".
+- GREEN task: verify ALL tests pass. If RED test still fails → BLOCKED.
+- REFACTOR task: verify ALL tests still pass. If fails → undo.
+### Red Flags — These Thoughts Mean STOP
+- "Too simple to need TDD" / "I'll write tests after" / "Let me explore first, then add tests"
+- "Tests after achieve the same goals" / "TDD will slow me down"
+All mean: **follow the cycle anyway**.
+### Rationalization Table
+| Excuse | Reality |
+|--------|---------|
+| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
+| "I'll test after" | Tests passing immediately prove nothing. |
+| "Need to explore first" | Fine. Throw away exploration, start fresh with TDD. |
+| "Test hard = design unclear" | Listen to the test. Hard to test = hard to use. |
+</tdd_mode>
 <context>
 $ARGUMENTS — phase number/text and optional flags.
-**Flags**: `-y` (auto), `-c N` (concurrency, default 4), `--continue` (resume), `--dir <path>`, `--gaps` (issue-linked), `--spec SPEC-xxx`, `--collab`, `--revise`, `--check`
+**Flags**: `-y` (auto), `-c N` (concurrency, default 4), `--continue` (resume), `--dir <path>`, `--gaps` (issue-linked), `--spec SPEC-xxx`, `--collab`, `--revise`, `--check`, `--tdd` (RED-GREEN-REFACTOR task chains)
 **Scope routing** (priority): --dir → from parent artifact; no args → milestone; digit → phase; text → adhoc/standalone.
@@ -68,7 +119,7 @@ S_RESUME → S_CHECK     WHEN: W2 done, check pending
 S_CONTEXT → S_CSV_GEN  DO: load context.md, conclusions.json, specs, wiki, codebase docs
-S_CSV_GEN → S_WAVE_1   DO: determine exploration angles, generate tasks.csv, user validates (skip -y)
+S_CSV_GEN → S_WAVE_1   DO: pre-flight (`maestro collab preflight --phase N`; exit 1 → warn + ask), determine exploration angles, generate tasks.csv, user validates (skip -y)
 S_WAVE_1 → S_WAVE_2    DO: spawn parallel explorations, merge results, build prev_context
@@ -130,7 +181,15 @@ Collision detection against same-milestone plans.
 <success_criteria>
 - [ ] Parallel explorations + sequential planning via spawn_agents_on_csv
-- [ ] plan.json + TASK files with read_first and grep-verifiable convergence criteria
-- [ ] Plan confidence scored, readiness gate checked, collision detected
-- [ ] PLN artifact registered, issues linked if --gaps
+- [ ] plan.json with summary, approach, task_ids, waves (with phase labels), confidence section
+- [ ] .task/TASK-*.json with read_first[] (file being modified + source of truth files)
+- [ ] Every task has convergence.criteria[] with grep-verifiable conditions (no subjective language)
+- [ ] Every task action and implementation contain concrete values (no "align X with Y")
+- [ ] Plan confidence scored with 5-dimension factor model
+- [ ] Readiness gate checked before collision detection
+- [ ] Pressure pass completed on highest-complexity task
+- [ ] Collision detection against same-milestone plans (non-blocking)
+- [ ] Plan-checker passed (or minor issues acknowledged, max 3 iterations)
+- [ ] PLN artifact registered in state.json
+- [ ] If --gaps: issues linked bidirectionally (task_refs[], task_plan_dir in issues.jsonl)
 </success_criteria>
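
The task chain and wave assignment added in the `--tdd` section can be illustrated with a small generator. This is a sketch only, assuming all behaviors are independent (so every RED task lands in wave 1) and that the REFACTOR task is never skipped. The function name `buildTddPlan`, the `wave` field, and the flat `tasks` array are hypothetical. The `type`, `tdd_phase`, `depends_on`, `tdd_mode`, and `tdd_groups` names follow the diff.

```typescript
type TddPhase = "red" | "green" | "refactor";

interface TddTask {
  id: string;
  type: "test" | "feature" | "refactor";
  tdd_phase: TddPhase;
  depends_on: string[];
  wave: number; // hypothetical field: 1 = RED, 2 = GREEN, 3 = REFACTOR
}

interface TddGroup {
  group: number;
  behavior: string;
  tasks: string[];
}

interface TddPlan {
  tdd_mode: true;
  tdd_groups: TddGroup[];
  tasks: TddTask[]; // hypothetical: the skill stores tasks in .task/TASK-*.json
}

// Expand each behavior into a RED -> GREEN -> REFACTOR triplet.
function buildTddPlan(behaviors: string[]): TddPlan {
  const tasks: TddTask[] = [];
  const tdd_groups: TddGroup[] = [];

  behaviors.forEach((behavior, index) => {
    const n = index + 1;
    const red = `TASK-${n}a`;
    const green = `TASK-${n}b`;
    const refactor = `TASK-${n}c`;

    // Strict dependency inside a group: a -> b -> c.
    tasks.push({ id: red, type: "test", tdd_phase: "red", depends_on: [], wave: 1 });
    tasks.push({ id: green, type: "feature", tdd_phase: "green", depends_on: [red], wave: 2 });
    tasks.push({ id: refactor, type: "refactor", tdd_phase: "refactor", depends_on: [green], wave: 3 });

    tdd_groups.push({ group: n, behavior, tasks: [red, green, refactor] });
  });

  return { tdd_mode: true, tdd_groups, tasks };
}
```

For two behaviors this yields `TASK-1a` and `TASK-2a` in wave 1, `TASK-1b` and `TASK-2b` in wave 2, and `TASK-1c` and `TASK-2c` in wave 3, matching the wave table in the diff.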

package/.codex/skills/maestro-ralph/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: maestro-ralph
-description: Adaptive lifecycle engine -- infer state, build command chain
+description: Use when the optimal command sequence is unclear and needs automated state-based determination
 argument-hint: "\"intent\" [-y] | status | continue | execute"
 allowed-tools: spawn_agents_on_csv, Read, Write, Edit, Bash, Glob, Grep, request_user_input
 ---

package/.codex/skills/maestro-verify/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: maestro-verify
-description: Verify goals with must-have checks and test coverage validation
+description: Use after execution to verify goals are actually achieved with evidence-based structural checks
 argument-hint: "[-y|--yes] [-c|--concurrency N] [--continue] \"<phase> [--skip-tests] [--skip-antipattern]\""
 allowed-tools: spawn_agents_on_csv, Read, Write, Edit, Bash, Glob, Grep, request_user_input
 ---
@@ -10,6 +10,18 @@ Wave-based 3-layer Goal-Backward verification using `spawn_agents_on_csv`.
 Wave 1 (truth + artifact existence) -> Wave 2 (substance + wiring) -> Wave 3 (anti-pattern + Nyquist audit).
 **Core principle**: Task completion != Goal achievement. A task marked complete may contain stubs/placeholders. This verifier checks that goals are actually achieved.
+## Iron Law
+**NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE IN THIS MESSAGE.** Before any success claim: IDENTIFY what command proves it → RUN it fresh → READ full output → VERIFY it confirms the claim → ONLY THEN make the claim.
+## Forbidden Wording
+BANNED: "Should work now", "Probably passes", "Seems correct", "Looks good", "I'm confident that...", any satisfaction BEFORE running verification. Replace with evidence: `"Tests pass: 42/42 green (exit 0)"`.
+## Red Flags — These Thoughts Mean STOP
+- "I just wrote this code, it definitely works" / "The changes are too small to break anything"
+- "I already verified this earlier" / "The agent said it's done"
+All mean: **run verification command NOW, read output, then report**.
 </purpose>
 <context>
@@ -182,12 +194,17 @@ Protocol: read before analysis, append-only, dedup by type+key.
 </error_codes>
 <success_criteria>
+- [ ] Must-haves established from convergence.criteria + success_criteria + derived behaviors
 - [ ] All 3 waves executed (with skip flags respected)
 - [ ] verification.json + context.md produced
 - [ ] validation.json produced (if Nyquist ran)
 - [ ] Fix plans generated for gap clusters
 - [ ] Issues auto-created for gaps + blocker anti-patterns
+- [ ] Post-verify knowledge inquiry triggered when applicable
 - [ ] Phase index.json updated with verification status
+- [ ] VRF artifact registered in state.json
+- [ ] Gap-fix closure loop documented: gaps → plan --gaps → execute → verify (re-run)
+- [ ] Next step routed (quality-review if passed, plan --gaps if gaps, quality-auto-test if low coverage)
 - [ ] discoveries.ndjson append-only throughout
 </success_criteria>
 </output>
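
The "Forbidden Wording" rule added above is an instruction to the agent, and nothing in this diff shows it being enforced in code. As an illustration only, a report could be screened for the listed phrases with a check like the one below. The names `BANNED_PHRASES` and `findBannedPhrases` are hypothetical, and the sketch covers only the literal phrases, not the broader "any satisfaction before running verification" clause.

```typescript
// Phrases listed under "Forbidden Wording" in the skill file, lower-cased.
const BANNED_PHRASES = [
  "should work now",
  "probably passes",
  "seems correct",
  "looks good",
  "i'm confident that",
];

// Return every banned phrase found in the report, in list order.
function findBannedPhrases(report: string): string[] {
  // Lower-case the text and fold curly apostrophes to straight ones.
  const text = report.toLowerCase().replace(/\u2019/g, "'");
  return BANNED_PHRASES.filter((phrase) => text.indexOf(phrase) !== -1);
}
```

An evidence-style statement such as `Tests pass: 42/42 green (exit 0)` yields an empty result, while `Looks good, should work now.` is flagged for two phrases.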

package/.codex/skills/quality-auto-test/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: quality-auto-test
-description: Auto-generate and run tests from specs or coverage gaps
+description: Use when test coverage needs automated expansion or existing tests need iterative convergence
 argument-hint: "<phase> [-y] [-c N] [--max-iter N] [--layer L0-L3] [--dry-run] [--re-run]"
 allowed-tools: spawn_agents_on_csv, Read, Write, Edit, Bash, Glob, Grep, AskUserQuestion
 ---
@@ -94,11 +94,15 @@ S_PARSE:
   -> S_SOURCE       DO: resolve phase dir, detect route (resume/re-run/spec/gap/code)
 S_SOURCE:
-  -> S_INFRA        DO: extract scenarios per route, normalize to unified format
+  -> S_INFRA        DO: extract scenarios per route, normalize to unified format, integrate quality artifacts
   Route A (spec): Parse REQ-*.md acceptance criteria, classify layers, generate fixtures
   Route B (gap): Read verification/coverage gaps, classify files by type
   Route C (code): Explore module boundaries, API endpoints, integration points
+  **Cross-artifact integration** (all routes, after primary extraction):
+  - **Review findings**: Query state.json for type=review artifacts on same phase. Extract critical/high findings → additional scenarios marked `source: "review_finding"`. If review verdict=="BLOCK" and these tests fail, suggest quality-debug.
+  - **Debug root causes**: Query state.json for type=debug artifacts on same phase. Generate regression test scenarios from confirmed root causes → marked `source: "debug_root_cause"`.
 S_INFRA:
   -> S_CSV_GEN      DO: detect framework, read 2-3 existing tests, build infrastructure_hints
@@ -244,12 +248,18 @@ Protocol: read before writing tests, append-only, dedup by type+key.
 <success_criteria>
 - [ ] Route auto-selected from project state (spec/gap/code)
+- [ ] Review findings and debug root causes integrated as additional test scenarios
 - [ ] Layers executed in order with fail-fast on critical
 - [ ] Test writing + diagnosis parallelized via spawn_agents_on_csv
 - [ ] Cross-layer context propagation via prev_context
 - [ ] Iteration engine: inner test_defect fix, outer strategy adjust
 - [ ] Test confidence scored per iteration (5-dimension model)
+- [ ] Convergence check includes confidence >= 60% alongside pass_rate threshold
+- [ ] Pressure pass completed on highest-pass-rate layer before completion
 - [ ] state.json, report.json, reflection-log.md written
-- [ ] If spec: traceability.md; if failures: issues auto-created
+- [ ] TST artifact registered in state.json
+- [ ] If spec: traceability.md written; if failures: issues auto-created in issues.jsonl
+- [ ] If gap source: validation.json gaps updated (MISSING→COVERED)
+- [ ] Next step routed (converged → verify, bugs → debug, >80% → quality-test, <80% → debug)
 </success_criteria>
 </output>
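
The new success criterion pairs a confidence floor of 60% with the existing pass-rate threshold. A minimal sketch of that combined check follows. The visible hunks do not state the pass-rate threshold, so it is left as a parameter, and the `IterationResult` shape and `hasConverged` name are hypothetical.

```typescript
// Hypothetical result shape; both values are percentages in the range 0-100.
interface IterationResult {
  pass_rate: number;
  confidence: number;
}

// Confidence floor stated in the success criteria.
const MIN_CONFIDENCE = 60;

// Converged only when both the pass rate and the confidence score clear their bars.
function hasConverged(result: IterationResult, passRateThreshold: number): boolean {
  return result.pass_rate >= passRateThreshold && result.confidence >= MIN_CONFIDENCE;
}
```

Under this reading, a high pass rate alone is not enough: an iteration with `pass_rate` 95 and `confidence` 55 does not converge against a threshold of 90.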