@exaudeus/workrail 3.8.1 → 3.8.2

package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@exaudeus/workrail",
- "version": "3.8.1",
+ "version": "3.8.2",
  "description": "Step-by-step workflow enforcement for AI agents via MCP",
  "license": "MIT",
  "repository": {
@@ -17,7 +17,7 @@
  "META DISTINCTION: you are authoring or modernizing a workflow, not executing one. Keep the authored workflow's concerns separate from this meta-workflow's execution.",
  "DEFAULT BEHAVIOR: self-execute with tools. Only ask the user for business decisions about the workflow being authored or modernized, not things you can learn from the schema, authoring spec, or example workflows.",
  "AUTHORED VOICE: prompts in the authored workflow must be user-voiced. No middleware narration, no pseudo-DSL, no tutorial framing, no teaching-product language.",
- "VOICE ADAPTATION: the lean coding workflow is one voice example, not the universal template. Adapt vocabulary and tone to the authored workflow's domain.",
+ "VOICE ADAPTATION: the lean coding workflow is one voice example, not the universal template. Copy structural patterns, not domain language. Adapt vocabulary and tone to the authored workflow's domain.",
  "VOICE EXAMPLES: Coding: 'Review the changes in this MR.' Ops: 'Check whether the pipeline is healthy.' Content: 'Read the draft and check the argument.' NOT: 'The system will now perform a comprehensive analysis of...'",
  "VALIDATION GATE: validate with real validators, not regex approximations. When validator output and authoring assumptions conflict, runtime wins.",
  "ARTIFACT STRATEGY: the workflow JSON file is the primary output. Intermediate notes go in output.notesMarkdown. Do not create extra planning artifacts unless the workflow is genuinely complex.",
@@ -87,7 +87,7 @@
  {
  "id": "phase-0-understand",
  "title": "Phase 0: Understand the Workflow to Author or Modernize",
- "prompt": "Before you write anything, understand what you're working on.\n\nStart by reading:\n- `workflow-schema` reference (legal structure)\n- `authoring-spec` reference (canonical authoring rules)\n- `authoring-guide-v2` reference (current v2 authoring principles)\n- `workflow-authoring-reference` reference (detailed structure patterns)\n- `lean-coding-workflow` reference (modern example to inspect)\n\nRead `routines-guide` too if you think the authored workflow may need delegation or template injection.\n\nThen decide what kind of authoring task this is:\n- `authoringMode`: `create` or `modernize_existing`\n\nIf `authoringMode = create`, understand:\n- What recurring task or problem should this workflow solve?\n- Who runs it and how often?\n- What does success look like?\n- What constraints exist (tools, permissions, domain rules)?\n\nIf `authoringMode = modernize_existing`, understand:\n- Which workflow file is being updated?\n- What should stay the same about its purpose?\n- What feels stale, legacy, repetitive, or misaligned with current authoring guidance?\n- What constraints apply to the modernization (keep file path, preserve compatibility, avoid broad rewrites, etc.)?\n\nExplore first. Use tools to understand the existing workflow, surrounding docs, and relevant domain context. 
Ask the user only what you genuinely cannot figure out yourself.\n\nThen classify:\n- `workflowComplexity`: Simple (linear, few steps) / Medium (branches, loops, or moderate step count) / Complex (multiple loops, delegation, extension points, many steps)\n- `rigorMode`: QUICK (simple linear workflow, low risk) / STANDARD (moderate complexity or domain risk) / THOROUGH (complex architecture, high stakes, needs review loops)\n\nCapture:\n- `authoringMode`\n- `workflowComplexity`\n- `rigorMode`\n- `taskDescription`\n- `intendedAudience`\n- `successCriteria`\n- `domainConstraints`\n- `targetWorkflowPath` (required for `modernize_existing`, otherwise empty)\n- `modernizationGoals` (required for `modernize_existing`, otherwise empty)\n- `openQuestions` (only real questions that need user input)",
+ "prompt": "Before you write anything, understand what you're working on.\n\nStart by reading:\n- `workflow-schema` reference (legal structure)\n- `authoring-spec` reference (canonical authoring rules)\n- `authoring-guide-v2` reference (current v2 authoring principles)\n- `workflow-authoring-reference` reference (detailed structure patterns)\n- `lean-coding-workflow` reference (modern example to inspect)\n\nRead `routines-guide` too if you think the authored workflow may need delegation or template injection.\n\nThen decide what kind of authoring task this is:\n- `authoringMode`: `create` or `modernize_existing`\n\nIf `authoringMode = create`, understand:\n- What recurring task or problem should this workflow solve?\n- Who runs it and how often?\n- What does success look like?\n- What constraints exist (tools, permissions, domain rules)?\n\nIf `authoringMode = modernize_existing`, understand:\n- Which workflow file is being updated?\n- What should stay the same about its purpose?\n- What feels stale, legacy, repetitive, or misaligned with current authoring guidance?\n- What constraints apply to the modernization (keep file path, preserve compatibility, avoid broad rewrites, etc.)?\n- Which modern example should act as the primary baseline, if any?\n\nFor `modernize_existing`, make an explicit baseline decision before architecture work:\n- choose exactly one `primaryBaseline` when a single modern example fits well\n- optional `secondaryBaselines` may be used for supporting patterns only\n- if no single baseline fits, set `primaryBaseline = none` and explain whether you are using a hybrid baseline or reasoning directly from schema + authoring guidance\n- list `patternsToBorrow` and `patternsToAvoid`\n\nRule:\n- baselines are models, not templates. Copy structural patterns, not another workflow's domain voice.\n\nExplore first. Use tools to understand the existing workflow, surrounding docs, and relevant domain context. 
Ask the user only what you genuinely cannot figure out yourself.\n\nThen classify:\n- `workflowComplexity`: Simple (linear, few steps) / Medium (branches, loops, or moderate step count) / Complex (multiple loops, delegation, extension points, many steps)\n- `rigorMode`: QUICK (simple linear workflow, low risk) / STANDARD (moderate complexity or domain risk) / THOROUGH (complex architecture, high stakes, needs review loops)\n\nCapture:\n- `authoringMode`\n- `workflowComplexity`\n- `rigorMode`\n- `taskDescription`\n- `intendedAudience`\n- `successCriteria`\n- `domainConstraints`\n- `targetWorkflowPath` (required for `modernize_existing`, otherwise empty)\n- `modernizationGoals` (required for `modernize_existing`, otherwise empty)\n- `primaryBaseline` (for `modernize_existing`, otherwise empty)\n- `secondaryBaselines` (for `modernize_existing`, otherwise empty)\n- `baselineDecisionRationale` (for `modernize_existing`, otherwise empty)\n- `patternsToBorrow` (for `modernize_existing`, otherwise empty)\n- `patternsToAvoid` (for `modernize_existing`, otherwise empty)\n- `openQuestions` (only real questions that need user input)",
  "requireConfirmation": true
  },
  {
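
Review note on the hunk above: the new Phase 0 prompt asks for an explicit baseline decision in `modernize_existing` mode. The sketch below is one way to check those captures for consistency. The field names come from the prompt; the TypeScript types, the `checkBaselineCapture` helper, and the convention that "empty" means an empty string or empty array are assumptions for illustration, not part of the package.

```typescript
// Hypothetical shape for the baseline-related Phase 0 captures.
// Field names are taken from the prompt; the types are assumed.
type AuthoringMode = "create" | "modernize_existing";

interface BaselineCapture {
  authoringMode: AuthoringMode;
  primaryBaseline: string; // an example name, "none", or "" in create mode
  secondaryBaselines: string[];
  baselineDecisionRationale: string;
  patternsToBorrow: string[];
  patternsToAvoid: string[];
}

// Returns readable problems; an empty array means the capture looks consistent.
function checkBaselineCapture(c: BaselineCapture): string[] {
  const problems: string[] = [];

  if (c.authoringMode === "create") {
    // The prompt says these captures stay empty outside modernize_existing.
    if (c.primaryBaseline !== "" || c.secondaryBaselines.length > 0) {
      problems.push("baseline fields should be empty in create mode");
    }
    return problems;
  }

  // modernize_existing: exactly one primary baseline, or an explicit "none".
  if (c.primaryBaseline === "") {
    problems.push("primaryBaseline must name one example or be 'none'");
  }
  // "none" must come with an explanation (hybrid baseline, or schema + guidance).
  if (c.primaryBaseline === "none" && c.baselineDecisionRationale.trim() === "") {
    problems.push("primaryBaseline = none requires a baselineDecisionRationale");
  }
  return problems;
}
```

A check like this only covers the structural side of the decision; whether a chosen baseline is copied for its patterns rather than its domain voice still needs a human or agent review.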
@@ -97,7 +97,7 @@
  "var": "workflowComplexity",
  "not_equals": "Simple"
  },
- "prompt": "Decide the architecture before you write JSON.\n\nBased on what you learned in Phase 0, decide:\n\n1. **Step structure**: how many phases, what each one does, what order\n2. **Loops**: does any phase need iteration? If so, what are the exit rules and max iterations?\n\nLoop design heuristics:\n- Add a loop ONLY when: (a) a quality gate may fail on first pass (validation, review), (b) each pass adds measurable value (progressive refinement), or (c) external feedback requires re-execution.\n- Do NOT loop when: (a) the agent can get it right in one pass with sufficient context, or (b) the full workflow is cheap enough to re-run entirely.\n- Every loop needs: an explicit exit condition (not vibes), a bounded maxIterations, and a decision step with outputContract.\n- Sensible defaults: validation ≈ 2-3, review/refinement ≈ 2, user-feedback ≈ 2-3 with confirmation gate. Go higher only with explicit justification in your notes.\n3. **Confirmation gates**: where does the user genuinely need to approve before proceeding? Don't add confirmations as ceremony.\n4. **Delegation**: does any step benefit from subagent routines? If so, which ones and why? Keep delegation bounded.\n5. **Prompt composition**: will any steps need promptFragments for rigor-mode branching? Will any steps share enough structure to use templates?\n6. **Extension points**: are there customizable slots that projects might want to override (e.g., a verification routine, a review routine)?\n7. **References**: should the authored workflow declare its own references to external docs?\n8. **Artifacts**: what does each step produce? Which artifact is canonical for which concern?\n9. 
**metaGuidance**: what persistent behavioral rules should the agent see on start and resume?\n\nIf `authoringMode = modernize_existing`, also decide:\n- should this workflow be preserved mostly in place, restructured selectively, or rewritten more substantially?\n- which existing steps, loops, references, or metaGuidance should stay because they still fit the workflow's purpose?\n- which legacy patterns or repetitive sections should be removed or reshaped?\n- whether the file path should stay the same or whether a new variant/file is genuinely warranted\n\nWrite the shape as a structured outline in your notes. Include:\n- Phase list with titles and one-line goals\n- Which phases loop and why\n- Which phases have confirmation gates and why\n- Context variables that flow between phases\n- Artifact ownership (which artifact is canonical for what)\n- for `modernize_existing`: whether the plan is preserve-in-place, restructure, or rewrite-biased and why\n\nDon't write JSON yet.\n\nCapture:\n- `workflowOutline`\n- `loopDesign`\n- `confirmationDesign`\n- `delegationDesign`\n- `artifactPlan`\n- `contextModel` (the context variables the workflow will use and where they're set)\n- `voiceStrategy` (domain vocabulary, authority posture: directive/collaborative/supervisory, density calibration)\n- `modernizationStrategy` (for `modernize_existing`: preserve_in_place / restructure / rewrite, otherwise empty)",
+ "prompt": "Decide the architecture before you write JSON.\n\nBased on what you learned in Phase 0, decide:\n\n1. **Step structure**: how many phases, what each one does, what order\n2. **Loops**: does any phase need iteration? If so, what are the exit rules and max iterations?\n\nLoop design heuristics:\n- Add a loop ONLY when: (a) a quality gate may fail on first pass (validation, review), (b) each pass adds measurable value (progressive refinement), or (c) external feedback requires re-execution.\n- Do NOT loop when: (a) the agent can get it right in one pass with sufficient context, or (b) the full workflow is cheap enough to re-run entirely.\n- Every loop needs: an explicit exit condition (not vibes), a bounded maxIterations, and a decision step with outputContract.\n- Sensible defaults: validation ≈ 2-3, review/refinement ≈ 2, user-feedback ≈ 2-3 with confirmation gate. Go higher only with explicit justification in your notes.\n3. **Confirmation gates**: where does the user genuinely need to approve before proceeding? Don't add confirmations as ceremony.\n4. **Delegation and reuse**: for each phase, decide between direct execution, routine delegation, template injection, or no special mechanism. If a routine or template is not used, say why not. Keep delegation bounded and keep ownership with the main agent.\n5. **Prompt composition**: will any steps need promptFragments for rigor-mode branching? Will any steps share enough structure to use templates?\n6. **Extension points**: are there customizable slots that projects might want to override (e.g., a verification routine, a review routine)?\n7. **References**: should the authored workflow declare its own references to external docs?\n8. **Artifacts**: what does each step produce? Which artifact is canonical for which concern?\n9. 
**metaGuidance**: what persistent behavioral rules should the agent see on start and resume?\n\nIf `authoringMode = modernize_existing`, also decide:\n- should this workflow be preserved mostly in place, restructured selectively, or rewritten more substantially?\n- which existing steps, loops, references, or metaGuidance should stay because they still fit the workflow's purpose?\n- which legacy patterns or repetitive sections should be removed or reshaped?\n- whether the file path should stay the same or whether a new variant/file is genuinely warranted\n- how each major old phase or behavior maps to the new workflow: `keep`, `merge`, `remove`, or `replace`\n\nFor `modernize_existing`, create a compact legacy mapping in your notes. For each major old phase or behavior, record:\n- source step or behavior\n- disposition: `keep` / `merge` / `remove` / `replace`\n- rationale\n- destination in the new workflow, if any\n\nFor routine and template decisions, create a compact audit in your notes. For each meaningful phase or concern, record:\n- chosen mechanism: direct / routine / template / none\n- why it helps or why it would be overkill\n- the ownership boundary that stays with the main agent\n\nWrite the shape as a structured outline in your notes. 
Include:\n- Phase list with titles and one-line goals\n- Which phases loop and why\n- Which phases have confirmation gates and why\n- Context variables that flow between phases\n- Artifact ownership (which artifact is canonical for what)\n- for `modernize_existing`: whether the plan is preserve-in-place, restructure, or rewrite-biased and why\n\nDon't write JSON yet.\n\nCapture:\n- `workflowOutline`\n- `loopDesign`\n- `confirmationDesign`\n- `delegationDesign`\n- `artifactPlan`\n- `contextModel` (the context variables the workflow will use and where they're set)\n- `voiceStrategy` (domain vocabulary, authority posture: directive/collaborative/supervisory, density calibration)\n- `routineAudit`\n- `delegationBoundaries`\n- `templateInjectionPlan`\n- `modernizationStrategy` (for `modernize_existing`: preserve_in_place / restructure / rewrite, otherwise empty)\n- `legacyMapping` (for `modernize_existing`, otherwise empty)\n- `behaviorPreservationNotes` (for `modernize_existing`, otherwise empty)",
  "requireConfirmation": {
  "or": [
  { "var": "workflowComplexity", "not_equals": "Simple" },
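
Review note on the hunk above: the new architecture prompt asks for a compact legacy mapping with a source, a disposition (`keep` / `merge` / `remove` / `replace`), a rationale, and an optional destination. A minimal sketch of that record follows, assuming the destination is a step id in the new workflow. The `LegacyMappingEntry` type and the `unresolvedEntries` helper are hypothetical names, and treating a `keep` or `merge` entry without a known destination as unresolved is one interpretation of the prompt, not a rule the package states.

```typescript
// Dispositions as listed in the prompt.
type Disposition = "keep" | "merge" | "remove" | "replace";

// Hypothetical record for one row of the legacy mapping.
interface LegacyMappingEntry {
  source: string; // old step or behavior
  disposition: Disposition;
  rationale: string;
  destination?: string; // assumed: a step id in the new workflow, if any
}

// Lists keep/merge entries that do not point at a step in the new workflow.
function unresolvedEntries(
  mapping: LegacyMappingEntry[],
  newStepIds: string[]
): LegacyMappingEntry[] {
  return mapping.filter(
    (e) =>
      (e.disposition === "keep" || e.disposition === "merge") &&
      (e.destination === undefined || newStepIds.indexOf(e.destination) === -1)
  );
}
```

Entries marked `remove` or `replace` are deliberately ignored here, since the prompt only requires a destination "if any".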
@@ -169,7 +169,7 @@
  {
  "id": "phase-4-review",
  "title": "Phase 4: Method Review",
- "prompt": "The workflow is valid. Now check whether it's actually good.\n\nScore each dimension 0-2 with one sentence of evidence:\n\n- `voiceClarity`: 0 = prompts are direct user-voiced asks in the workflow's domain vocabulary, 1 = mostly user-voiced but borrows vocabulary from other domains or has middleware narration, 2 = reads like system documentation or sounds like a different domain\n- `ceremonyLevel`: 0 = confirmations only at real decision points, 1 = one or two unnecessary gates, 2 = over-asks the user or adds routine ceremony\n- `loopSoundness`: 0 = loops have explicit exit rules, bounded iterations, and real decision steps, 1 = minor issues with exit clarity, 2 = vibes-only exit conditions or unbounded loops (score 0 if no loops)\n- `delegationBoundedness`: 0 = delegation is bounded and explicit or absent, 1 = one delegation could be tighter, 2 = open-ended or ownership-transferring delegation (score 0 if no delegation)\n- `legacyPatterns`: 0 = no legacy anti-patterns, 1 = minor legacy residue, 2 = pseudo-DSL, learning paths, satisfaction loops, or regex-as-gate present\n- `artifactClarity`: 0 = clear what each artifact is for and which is canonical, 1 = mostly clear, 2 = ambiguous artifact ownership\n- `modeFit`: 0 = the workflow fits the selected `authoringMode`, 1 = minor creation/modernization mismatch remains, 2 = the workflow still reads like the wrong mode entirely\n\nIf the total score is 0-3: the workflow is ready.\nIf the total score is 4-6: fix the worst dimensions before proceeding.\nIf the total score is 7+: this needs significant rework. 
Fix the worst dimensions here, re-validate, and record what you would change if you could redraft from scratch.\n\nIf `authoringMode = modernize_existing`, check explicitly:\n- does the updated workflow preserve the right purpose?\n- did you remove legacy structure without rewriting valuable behavior away?\n- do any prompts, captures, or handoff notes still assume this was a brand-new workflow?\n\nFix any issues directly in the workflow file. Re-run validation if you changed structure.\n\nCapture:\n- `reviewScores`\n- `reviewPassed`\n- `fixesApplied`",
+ "prompt": "The workflow is valid. Now check whether it's actually good.\n\nScore each dimension 0-2 with one sentence of evidence:\n\n- `voiceClarity`: 0 = prompts are direct user-voiced asks in the workflow's domain vocabulary, 1 = mostly user-voiced but borrows vocabulary from other domains or has middleware narration, 2 = reads like system documentation or sounds like a different domain\n- `ceremonyLevel`: 0 = confirmations only at real decision points, 1 = one or two unnecessary gates, 2 = over-asks the user or adds routine ceremony\n- `loopSoundness`: 0 = loops have explicit exit rules, bounded iterations, and real decision steps, 1 = minor issues with exit clarity, 2 = vibes-only exit conditions or unbounded loops (score 0 if no loops)\n- `delegationBoundedness`: 0 = delegation is bounded and explicit or absent, 1 = one delegation could be tighter or a good routine/template opportunity was missed, 2 = open-ended or ownership-transferring delegation, or routine/template choices are unjustified (score 0 if no delegation and no reuse need exists)\n- `legacyPatterns`: 0 = no legacy anti-patterns, 1 = minor legacy residue, 2 = pseudo-DSL, learning paths, satisfaction loops, or regex-as-gate present\n- `artifactClarity`: 0 = clear what each artifact is for and which is canonical, 1 = mostly clear, 2 = ambiguous artifact ownership\n- `modeFit`: 0 = the workflow fits the selected `authoringMode`, 1 = minor creation/modernization mismatch remains, 2 = the workflow still reads like the wrong mode entirely\n- `modernizationDiscipline`: 0 = valuable behavior was preserved and legacy structure was removed cleanly, 1 = minor mismatch or over/under-preservation, 2 = either valuable behavior was lost or legacy structure still dominates (score 0 for `create` mode)\n\nIf the total score is 0-3: the workflow is ready.\nIf the total score is 4-6: fix the worst dimensions before proceeding.\nIf the total score is 7+: this needs significant rework. 
Fix the worst dimensions here, re-validate, and record what you would change if you could redraft from scratch.\n\nIf `authoringMode = modernize_existing`, check explicitly:\n- does the updated workflow preserve the right purpose?\n- did you remove legacy structure without rewriting valuable behavior away?\n- does the final workflow still align with `primaryBaseline` and `patternsToBorrow` without copying domain language?\n- does the final workflow respect the `legacyMapping`, especially for anything marked keep or merge?\n- do the routine/template choices still match the `routineAudit` and stay bounded?\n- do any prompts, captures, or handoff notes still assume this was a brand-new workflow?\n\nFix any issues directly in the workflow file. Re-run validation if you changed structure.\n\nCapture:\n- `reviewScores`\n- `reviewPassed`\n- `fixesApplied`",
  "promptFragments": [
  {
  "id": "phase-4-quick-skip",
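
Review note on the Phase 4 hunk: the prompt scores each dimension 0-2 and maps the total to three outcomes (0-3 ready, 4-6 fix the worst dimensions, 7+ significant rework). Version 3.8.2 adds an eighth dimension, `modernizationDiscipline`, while leaving those thresholds unchanged. The sketch below illustrates the arithmetic only; `reviewVerdict`, the verdict labels, and the `DimensionScore` type are hypothetical names, not identifiers from the package.

```typescript
type Score = 0 | 1 | 2;
type Verdict = "ready" | "fix_worst_dimensions" | "significant_rework";

// One scored dimension, e.g. { name: "voiceClarity", score: 1 }.
interface DimensionScore {
  name: string;
  score: Score;
}

// Applies the thresholds stated in the prompt: 0-3, 4-6, 7+.
function reviewVerdict(scores: DimensionScore[]): {
  total: number;
  verdict: Verdict;
  worst: string[];
} {
  const total = scores.reduce((sum, d) => sum + d.score, 0);
  const max = scores.reduce((m, d) => Math.max(m, d.score), 0);
  // "Worst" dimensions are those tied for the highest non-zero score.
  const worst: string[] =
    max === 0 ? [] : scores.filter((d) => d.score === max).map((d) => d.name);
  const verdict: Verdict =
    total <= 3 ? "ready" : total <= 6 ? "fix_worst_dimensions" : "significant_rework";
  return { total, verdict, worst };
}
```

For example, scores of 1, 2, and 2 on three dimensions with zeros elsewhere total 5, which lands in the "fix the worst dimensions" band, and the two dimensions scored 2 are reported as the worst.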