@exaudeus/workrail 1.15.0 → 1.15.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@exaudeus/workrail",
- "version": "1.15.0",
+ "version": "1.15.1",
  "description": "Step-by-step workflow enforcement for AI agents via MCP",
  "license": "MIT",
  "repository": {
@@ -41,7 +41,7 @@
  "DECISION LOG: Entry includes Decision, Why (1-3 bullets), Impacted files (≤5), User feedback, Surprises. Cap 8 bullets.",
  "VARIABLE TYPES: Strings: taskComplexity, rigorMode, prStrategy, selectedApproach, runnerUpApproach, leadingCandidate, architectureRationale, keyRiskToMonitor, selectedSliceStrategy.",
  "VARIABLE TYPES: Strings (cont): pivotSeverity (none/MINOR/MODERATE/MAJOR), pivotReturnPhase, cleanSlateDivergence (None/Minor/Major).",
- "VARIABLE TYPES: Arrays: approaches, pivotTriggers, preMortemFindings, sliceStrategies, planningGaps, integrationGaps, integrationVerificationFindings, invariantViolations. Numbers: planConfidence, sliceIndex.",
+ "VARIABLE TYPES: Arrays: approaches, pivotTriggers, preMortemFindings, sliceStrategies, planningGaps, integrationGaps, integrationVerificationFindings, invariantViolations, resolvedFindings. Numbers: planConfidence, sliceIndex.",
  "VARIABLE TYPES: Booleans: continuePlanning, pivotTriggered, planningComplete, validationFailed, slicePlanStale, majorConcernsRaised.",
  "VARIABLE TYPES: Booleans (cont): spikeFailure, assumptionsValidated, planningGapsFound, integrationGapsFound, integrationVerificationPassed, integrationVerificationFailed, regressionDetected."
  ],
@@ -311,19 +311,19 @@
  {
  "id": "phase-5a-draft-implementation-plan",
  "title": "Plan Artifact Draft/Update",
- "prompt": "Create or update the **Plan Artifact** (deterministic schema).\n\n**Write-or-paste rule:** attempt to write/update `implementation_plan.md`. If file writing fails, output full content in chat (canonical).\n\n**Plan Artifact headings (concise, complete):**\n\n1) Problem statement\n2) Acceptance criteria (bullets)\n3) Non-goals (bullets)\n4) **User rules/preferences applied:**\n - Relevant `userRules` + how plan respects them.\n - Deviations: rationale + mitigation + user decision (counts toward `maxQuestions`).\n5) Invariants (reference `invariants`)\n6) Proposed approach (1–2 paragraphs)\n7) Architecture decision (reference Phase 3/3b outputs):\n - Selected approach: reference `selectedApproach`\n - Rationale: reference `architectureRationale`\n - Runner-up (Plan B): reference `runnerUpApproach`\n - Key risk: reference `keyRiskToMonitor`\n - Full alternatives: see CONTEXT.md Approaches section\n8) **Vertical slices** (match `slices`: scope, done-definition, files, verification)\n\n **Work Packages inside each slice (mode-dependent):**\n - QUICK: skip work packages\n - STANDARD: optional; recommended when slice is high-risk or multi-layer\n - THOROUGH: required for non-trivial slices\n\n Each work package (WP):\n - ID: `S<sliceIndex>-WP<k>` (e.g., S1-WP1)\n - Goal: one coherent outcome\n - Targets (allowlist): dirs/files (+ allowed new files)\n - Forbidden (denylist): files/dirs not to touch\n - Budget: maxModified (5 STANDARD/8 THOROUGH), maxNew (2/3)\n - Done-definition: 2–5 bullets\n - Verification: 1–3 commands/tests\n - Dependencies: contracts/types from other WPs (if parallel)\n\n **Parallelism rule:** parallelize only if Targets don't overlap. Final WP must be \"Hook-up/Integration\" when parallel was used.\n\n9) Test plan (unit/integration/e2e; cite repo patterns)\n10) Risk register (risks + mitigation + rollback/flag)\n11) PR packaging (Single/Multi + rule)\n\n**Set context variables:**\n- `planArtifact`\n- `implementationPlan`\n\n**VERIFY:** concrete enough for another engineer to implement without guessing.",
+ "prompt": "Create or update the **Plan Artifact** (deterministic schema).\n\n**Write-or-paste rule:** attempt to write/update `implementation_plan.md`. If file writing fails, output full content in chat (canonical).\n\n**Plan Artifact headings (concise, complete):**\n\n1) Problem statement\n2) Acceptance criteria (bullets)\n3) Non-goals (bullets)\n4) **User rules/preferences applied:**\n - Relevant `userRules` + how plan respects them.\n - Deviations: rationale + mitigation + user decision (counts toward `maxQuestions`).\n5) Invariants (reference `invariants`)\n6) Proposed approach (1–2 paragraphs)\n7) Architecture decision (reference Phase 3/3b outputs):\n - Selected approach: reference `selectedApproach`\n - Rationale: reference `architectureRationale`\n - Runner-up (Plan B): reference `runnerUpApproach`\n - Key risk: reference `keyRiskToMonitor`\n - Full alternatives: see CONTEXT.md Approaches section\n8) **Vertical slices** (match `slices`: scope, done-definition, files, verification)\n\n **Work Packages inside each slice (mode-dependent):**\n - QUICK: skip work packages\n - STANDARD: optional; recommended when slice is high-risk or multi-layer\n - THOROUGH: required for non-trivial slices\n\n Each work package (WP):\n - ID: `S<sliceIndex>-WP<k>` (e.g., S1-WP1)\n - Goal: one coherent outcome\n - Targets (allowlist): dirs/files (+ allowed new files)\n - Forbidden (denylist): files/dirs not to touch\n - Budget: maxModified (5 STANDARD/8 THOROUGH), maxNew (2/3)\n - Done-definition: 2–5 bullets\n - Verification: 1–3 commands/tests\n - Dependencies: contracts/types from other WPs (if parallel)\n\n **Parallelism rule:** parallelize only if Targets don't overlap. Final WP must be \"Hook-up/Integration\" when parallel was used.\n\n9) Test plan (unit/integration/e2e; cite repo patterns)\n10) Risk register (risks + mitigation + rollback/flag)\n11) PR packaging (Single/Multi + rule)\n12) **Philosophy alignment per slice** (for each slice, include):\n - For each design principle touched by this slice: [principle] → [satisfied / tension / violated + 1-line why]\n - The audit step will independently verify these self-assessments. Be honest — violations caught early are cheaper than violations caught in review.\n\n**Set context variables:**\n- `planArtifact`\n- `implementationPlan`\n\n**VERIFY:** concrete enough for another engineer to implement without guessing.",
  "requireConfirmation": false
  },
  {
  "id": "phase-5b-plan-audit-mode-adaptive",
  "title": "Plan Audit (Subagent-Friendly)",
- "prompt": "**Mission: Find gaps, issues, and inconsistencies in this plan.**\n\nActively look for:\n- **Gaps**: What's missing? What's not covered?\n- **Weak assumptions**: What could be wrong? What are we taking for granted?\n- **Inconsistencies**: Do parts contradict each other? Does the plan match the invariants?\n- **Risks**: What could go wrong? What hasn't been stress-tested?\n\n---\n\n**Mode behavior:**\n- QUICK: self-audit only\n- STANDARD: self-audit; delegate once if subagents exist\n- THOROUGH: parallel delegation if subagents exist\n\n**If subagents + `rigorMode=THOROUGH`:**\n\nYou have permission to spawn THREE subagents SIMULTANEOUSLY for parallel plan validation.\n\nDelegate to WorkRail Executor THREE TIMES with scoped context:\n\n**Delegation 1 — Plan Analysis:**\n- routine: routine-plan-analysis\n- plan: implementation_plan.md\n- requirements: [From Phase 2 invariants + acceptance criteria]\n- constraints: [Filtered userRules: architecture, testing, patterns]\n- context (file-reference-first, max 500 words if pasting):\n - Read: CONTEXT.md (userRules section), implementation_plan.md\n - Read: spec.md, design.md (if exist)\n - Invariants + locks (if locksMatrix exists)\n - Feature brief: problem statement + architecture decision + key constraints\n- deliverable: plan-analysis.md\n\n**Delegation 2 — Hypothesis Challenge:**\n- routine: routine-hypothesis-challenge\n- rigor: 3\n- hypotheses: [Plan's key assumptions about architecture, dependencies, invariant satisfaction]\n- evidence: implementation_plan.md\n- context:\n - Read: implementation_plan.md\n - Filtered userRules: error handling, edge cases, validation rules\n - Invariants (especially high-risk ones)\n - Feature brief: problem + acceptance criteria + non-goals\n- deliverable: plan-challenges.md\n\n**Delegation 3 — Execution Simulation:**\n- routine: routine-execution-simulation\n- entry_point: [Riskiest slice entry function]\n- inputs: [Expected inputs and state]\n- trace_depth: 3 (follow calls to understand failure modes)\n- context:\n - Read: implementation_plan.md (riskiest slice section)\n - Filtered userRules: performance, data flow, state management\n - Invariants touched by risky slice\n - Feature brief: architecture decision + risk register\n- deliverable: simulation-results.md\n\n**Self-check before delegating (required):**\n✅ Each delegation includes filtered userRules (not full list)\n✅ Each includes invariants + locks (if applicable)\n✅ Each includes feature brief (file refs or <500 word excerpt)\n✅ Each has specific focus/lens\n\n**If subagents + `rigorMode=STANDARD`:**\nDelegate ONCE using Plan Analysis with full context (not filtered).\n\n\n**Note:** delegationMode was detected in phase-0c and cached in CONTEXT.md\n**Else:** self-audit (same three lenses).\n\n**Output:**\n- Findings: Critical / Major / Minor\n- Plan amendments\n\n---\n\n**CLEAN-SLATE CHECK (STANDARD+, if findings exist):**\n\nBefore applying amendments, briefly answer:\n\n> \"If I started fresh right now, knowing everything I've learned, would I choose the same approach?\"\n\n1. Without looking at current plan, sketch in 1 sentence what approach you'd take\n2. Compare to `selectedApproach`:\n - **Same**: Proceed with amendments\n - **Minor variation**: Note the insight; consider incorporating\n - **Fundamentally different**: STOP. Set `cleanSlateDivergence = Major`\n\n**If fundamentally different:**\n- Document why fresh thinking differs\n- Return to Phase 3b with fresh approach as new candidate, OR\n- Document why current approach is still better despite fresh thinking\n\n**Set:** `planFindings`, `planAmendments`, `planConfidence` (1–10), `cleanSlateDivergence` (None/Minor/Major)",
+ "prompt": "**Mission: Find gaps, issues, and inconsistencies in this plan.**\n\nActively look for:\n- **Gaps**: What's missing? What's not covered?\n- **Weak assumptions**: What could be wrong? What are we taking for granted?\n- **Inconsistencies**: Do parts contradict each other? Does the plan match the invariants?\n- **Risks**: What could go wrong? What hasn't been stress-tested?\n\n---\n\n**Mode behavior:**\n- QUICK: self-audit only\n- STANDARD: self-audit; delegate once if subagents exist\n- THOROUGH: parallel delegation if subagents exist\n\n**If subagents + `rigorMode=THOROUGH`:**\n\nYou have permission to spawn THREE subagents SIMULTANEOUSLY for parallel plan validation.\n\nDelegate to WorkRail Executor THREE TIMES with scoped context:\n\n**Delegation 1 — Plan Analysis:**\n- routine: routine-plan-analysis\n- plan: implementation_plan.md\n- requirements: [From Phase 2 invariants + acceptance criteria]\n- constraints: [Filtered userRules: architecture, testing, patterns]\n- context (file-reference-first, max 500 words if pasting):\n - Read: CONTEXT.md (userRules section), implementation_plan.md\n - Read: spec.md, design.md (if exist)\n - Invariants + locks (if locksMatrix exists)\n - Feature brief: problem statement + architecture decision + key constraints\n- deliverable: plan-analysis.md\n\n**Delegation 2 — Hypothesis Challenge:**\n- routine: routine-hypothesis-challenge\n- rigor: 3\n- hypotheses: [Plan's key assumptions about architecture, dependencies, invariant satisfaction]\n- evidence: implementation_plan.md\n- context:\n - Read: implementation_plan.md\n - Filtered userRules: error handling, edge cases, validation rules\n - Invariants (especially high-risk ones)\n - Feature brief: problem + acceptance criteria + non-goals\n- deliverable: plan-challenges.md\n\n**Delegation 3 — Execution Simulation:**\n- routine: routine-execution-simulation\n- entry_point: [Riskiest slice entry function]\n- inputs: [Expected inputs and state]\n- trace_depth: 3 (follow calls to understand failure modes)\n- context:\n - Read: implementation_plan.md (riskiest slice section)\n - Filtered userRules: performance, data flow, state management\n - Invariants touched by risky slice\n - Feature brief: architecture decision + risk register\n- deliverable: simulation-results.md\n\n**Self-check before delegating (required):**\n✅ Each delegation includes filtered userRules (not full list)\n✅ Each includes invariants + locks (if applicable)\n✅ Each includes feature brief (file refs or <500 word excerpt)\n✅ Each has specific focus/lens\n\n**If subagents + `rigorMode=STANDARD`:**\nDelegate ONCE using Plan Analysis with full context (not filtered).\n\n\n**Note:** delegationMode was detected in phase-0c and cached in CONTEXT.md\n**Else:** self-audit (same three lenses).\n\n**Output:**\n- Findings: Critical / Major / Minor\n- Plan amendments\n\n---\n\n**CLEAN-SLATE CHECK (STANDARD+, if findings exist):**\n\nBefore applying amendments, briefly answer:\n\n> \"If I started fresh right now, knowing everything I've learned, would I choose the same approach?\"\n\n1. Without looking at current plan, sketch in 1 sentence what approach you'd take\n2. Compare to `selectedApproach`:\n - **Same**: Proceed with amendments\n - **Minor variation**: Note the insight; consider incorporating\n - **Fundamentally different**: STOP. Set `cleanSlateDivergence = Major`\n\n**If fundamentally different:**\n- Document why fresh thinking differs\n- Return to Phase 3b with fresh approach as new candidate, OR\n- Document why current approach is still better despite fresh thinking\n\n---\n\n**REGRESSION CHECK (iteration 2+, if `resolvedFindings` is non-empty):**\n\nBefore running the forward-looking audit, verify each item in `resolvedFindings`:\n- Is the resolution still valid in the current plan?\n- Has the amendment been reverted or contradicted by subsequent changes?\n\nIf ANY regression found: add to `planFindings` with severity Critical and prefix \"REGRESSION: previously resolved finding reverted.\"\n\n---\n\n**PHILOSOPHY ALIGNMENT CHECK (mandatory, all modes):**\n\nReview the plan against the user's coding philosophy and design principles from `userRules`.\n\nThis evaluates DESIGN QUALITY — not plan consistency. Stale acceptance criteria, missing requirements, and coverage gaps are covered by the completeness audit above.\n\nIf no philosophy or design principles are found in `userRules`, skip this section and note \"No philosophy principles configured.\"\n\n**Required output format** (structured table):\nFor each violation or tension found:\n\n| Principle | Violation | Severity | Action |\n|-----------|-----------|----------|--------|\n| [Principle name from userRules] | [What violates it and why] | Red / Orange / Yellow | [Specific fix or justification needed] |\n\nSeverity guide:\n- **Red** (blocking) = must fix before implementation. Add to `planFindings`.\n- **Orange** (design quality) = should fix; document if intentionally accepted. Add to `planFindings`.\n- **Yellow** (tension) = tension between principles; document the tradeoff. Do NOT add to `planFindings` — these are informational only.\n\nChecklist — actively check: immutability, error handling model (Result/sealed vs exceptions), test doubles strategy (fakes vs mocks), dead code, naming clarity, abstraction level, type safety, exhaustiveness.\n\nIf NO violations found: explicitly state \"Philosophy check: no violations found\" with brief evidence (e.g., \"error handling uses Result<T> per philosophy; test doubles are fakes not mocks\"). Do NOT rubber-stamp. If you find zero violations on a non-trivial plan, double-check naming, dead code, and abstraction choices.\n\n**Set:** `planFindings`, `planAmendments`, `planConfidence` (1–10), `cleanSlateDivergence` (None/Minor/Major)",
  "requireConfirmation": false
  },
  {
  "id": "phase-5c-refocus-and-ticket-extraction",
  "title": "Refocus: Amendments + Tickets + Drift Detection",
- "prompt": "Apply amendments and refocus.\n\n**Do:**\n- Update `planArtifact` + `implementationPlan` to incorporate `planAmendments`.\n- Extract out-of-scope work into `followUpTickets`.\n- Ensure plan follows `invariants` and stays slice-oriented.\n\n**Drift detection:**\n- If user introduced new constraints/preferences, update `userRules` and log in `CONTEXT.md`.\n\n**CONTEXT LOGGING:** Update CONTEXT.md Decision Log (follow format from metaGuidance) - record amendments accepted/rejected and why, user pushback, and scope/rules/verification drift\n\n**Set:** `followUpTickets`\n\n**VERIFY:** plan is coherent and PR-sized by slice.",
+ "prompt": "Apply amendments and refocus.\n\n**Do:**\n- Update `planArtifact` + `implementationPlan` to incorporate `planAmendments`.\n- Extract out-of-scope work into `followUpTickets`.\n- Ensure plan follows `invariants` and stays slice-oriented.\n\n**RESOLVED FINDINGS LEDGER (required):**\n\nWhen applying amendments, maintain the `resolvedFindings` context variable:\n- For each finding resolved in this iteration, add an entry: { finding: \"...\", resolution: \"...\", iteration: N }\n- Cap at 10 entries (if exceeded, drop oldest entries first)\n- This ledger carries forward to the next audit pass for regression checking\n\n**Set:** `resolvedFindings` (array, append new resolutions)\n\n**Drift detection:**\n- If user introduced new constraints/preferences, update `userRules` and log in `CONTEXT.md`.\n\n**CONTEXT LOGGING:** Update CONTEXT.md Decision Log (follow format from metaGuidance) - record amendments accepted/rejected and why, user pushback, and scope/rules/verification drift\n\n**Set:** `followUpTickets`\n\n**VERIFY:** plan is coherent and PR-sized by slice.",
  "requireConfirmation": {
  "or": [
  {
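The `resolvedFindings` ledger added in phase-5c is a simple bounded append-only list: new resolutions are appended each iteration, capped at 10 entries, dropping the oldest first. A minimal TypeScript sketch of that bookkeeping — the function and variable names are hypothetical; only the entry shape `{ finding, resolution, iteration }` and the cap come from the workflow text:

```typescript
// Entry shape taken from the phase-5c prompt: { finding, resolution, iteration }.
interface ResolvedFinding {
  finding: string;
  resolution: string;
  iteration: number;
}

// Cap from the prompt: "Cap at 10 entries (if exceeded, drop oldest entries first)".
const LEDGER_CAP = 10;

// Hypothetical helper: append a resolution and enforce the cap by dropping the
// oldest entries, so the most recent resolutions carry forward to the next audit.
function appendResolution(
  ledger: readonly ResolvedFinding[],
  entry: ResolvedFinding,
): ResolvedFinding[] {
  const next = [...ledger, entry];
  return next.length > LEDGER_CAP ? next.slice(next.length - LEDGER_CAP) : next;
}
```

The next audit pass (the phase-5b regression check) can then iterate this array without the ledger growing unboundedly across iterations.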