@exaudeus/workrail 3.66.0 → 3.67.0

@@ -1,788 +0,0 @@
1
- {
2
- "id": "workflow-for-workflows",
3
- "name": "Workflow Authoring Workflow",
4
- "version": "2.5.0",
5
- "metricsProfile": "design",
6
- "description": "Use this to author or modernize a WorkRail workflow. Guides through understanding the task, defining effectiveness targets, designing architecture and quality gates, drafting, validating, assigning tags, and handing off.",
7
- "about": "## Workflow Authoring Workflow\n\nThis is the standard WorkRail workflow for creating a new workflow from scratch or modernizing an existing one. It is the trust gate for all other workflows: a workflow is not considered production-ready until it has passed through here.\n\n**What it does:**\nThe workflow walks through the full authoring lifecycle: understanding the task, choosing the right baseline and archetype, designing the phase and quality-gate architecture, drafting the workflow JSON, running structural validators, auditing state fields for bloat, simulating execution against real scenarios, running an adversarial quality review, and producing a final trust handoff. For modernization tasks it builds a value inventory first to ensure enforcement mechanisms, domain knowledge, and behavioral rules are preserved or equivalently replaced.\n\n**When to use it:**\n- You want to author a new WorkRail workflow for a recurring task or problem\n- You have an existing workflow that is outdated, uses legacy patterns (pseudo-DSL, regex validation, satisfaction-score loops), or produces shallow results\n- You want a workflow that will pass the WorkRail quality bar and be trusted to run in production\n\n**What it produces:**\nA validated, tagged workflow JSON file with a `validatedAgainstSpecVersion` stamp. A final trust handoff with readiness verdict, known failure modes, residual weaknesses, and testing guidance.\n\n**How to get good results:**\nDescribe the recurring task the workflow should solve, who will run it, and what a satisfying result looks like. For modernization, point to the existing workflow file. The workflow reads the schema and authoring spec itself -- you do not need to know the JSON format in advance.",
8
- "examples": [
9
- "Create a new workflow for conducting weekly engineering retrospectives",
10
- "Modernize the old exploration workflow to use v2 authoring patterns and remove pseudo-DSL",
11
- "Author a workflow for triaging and prioritizing incoming support tickets",
12
- "Upgrade the code review workflow to add adversarial reviewer families and hard gates"
13
- ],
14
- "recommendedPreferences": {
15
- "recommendedAutonomy": "guided",
16
- "recommendedRiskPolicy": "conservative"
17
- },
18
- "features": [
19
- "wr.features.subagent_guidance"
20
- ],
21
- "preconditions": [
22
- "User has a recurring task or problem a workflow should solve, or an existing workflow that should be modernized.",
23
- "Agent has access to file creation, editing, and terminal tools.",
24
- "Agent can run workflow validators such as `npm run validate:registry` or equivalent.",
25
- "Write output workflow JSON to ~/.workrail/workflows/<name>.json by default. Only write to a repo's workflows/ dir when the user explicitly asks to contribute a bundled workflow."
26
- ],
27
- "metaGuidance": [
28
- "REFERENCE HIERARCHY: treat workflow-schema as legal truth for structure. Treat authoring-spec as canonical current guidance for what makes a workflow good. Treat authoring-provenance as optional maintainer context only.",
29
- "META DISTINCTION: you are authoring or modernizing a workflow, not executing one. Keep the authored workflow's concerns separate from this meta-workflow's execution.",
30
- "QUALITY-GATE ROLE: this workflow is the trust gate for other workflows. It must optimize not only for validity and modern authoring, but also for task effectiveness, false-confidence resistance, and future maintainability.",
31
- "DEFAULT BEHAVIOR: self-execute with tools. Only ask the user for business decisions about the workflow being authored or modernized, not things you can learn from the schema, authoring spec, or example workflows.",
32
- "AUTHORED VOICE: prompts in the authored workflow must be user-voiced. No middleware narration, no pseudo-DSL, no tutorial framing, no teaching-product language.",
33
- "BASELINE DISCIPLINE: choose both an authoring baseline and an outcome baseline whenever possible. Copy structural patterns, not domain language.",
34
- "VALIDATION GATE: validate with real validators, not regex approximations. When validator output and authoring assumptions conflict, runtime wins.",
35
- "DEEP REVIEW: authoring integrity and outcome effectiveness are separate concerns. A workflow is not ready unless both pass.",
36
- "RIGOR: always run the deepest review path -- state economy audit, execution simulation, adversarial review, and redesign if hard gates fail. There is no reduced-rigor mode.",
37
- "ARTIFACT STRATEGY: the workflow JSON file is the primary output. Intermediate notes go in output.notesMarkdown. Do not create extra planning artifacts unless the workflow is genuinely complex.",
38
- "V2 DURABILITY: use output.notesMarkdown as the primary durable record. Do not mirror execution state into CONTEXT.md or markdown checkpoint files.",
39
- "ANTI-PATTERNS TO AVOID IN AUTHORED WORKFLOWS: no pseudo-function metaGuidance, no learning-path branching, no satisfaction-score loops, no heavy clarification batteries, no regex-as-primary-validation, no celebration phases.",
40
- "MODERNIZATION DISTINCTION: remove format problems (pseudo-DSL, regex, A/B phases). Preserve or equivalently replace behavioral mechanisms (forcing functions, hard gates, domain knowledge). Never silently drop a mechanism that prevents a real failure mode.",
41
- "EQUIVALENT REPLACEMENT: a replacement only qualifies if it prevents the same failure mode with similar enforcement strength. A rubric suggestion is not equivalent to a hard gate. Document the tradeoff explicitly when the replacement is weaker.",
42
- "NEVER COMMIT MARKDOWN FILES UNLESS USER EXPLICITLY ASKS.",
43
- "OUTPUT LOCATION: write workflow JSON to ~/.workrail/workflows/<name>.json unless user asks to contribute a bundled workflow. Never write to a repo's workflows/ dir by default -- it may be public."
44
- ],
45
- "references": [
46
- {
47
- "id": "workflow-schema",
48
- "title": "Workflow JSON Schema",
49
- "source": "spec/workflow.schema.json",
50
- "resolveFrom": "package",
51
- "purpose": "Canonical schema for validating structure and field semantics while authoring workflows. This is legal truth.",
52
- "authoritative": true
53
- },
54
- {
55
- "id": "authoring-spec",
56
- "title": "Workflow Authoring Specification",
57
- "source": "spec/authoring-spec.json",
58
- "resolveFrom": "package",
59
- "purpose": "Canonical authoring guidance, rule levels, and examples for composing workflows correctly. This is current authoring truth.",
60
- "authoritative": true
61
- },
62
- {
63
- "id": "authoring-provenance",
64
- "title": "Workflow Authoring Provenance",
65
- "source": "spec/authoring-spec.provenance.json",
66
- "resolveFrom": "package",
67
- "purpose": "Source-of-truth map showing what is canonical, derived, and non-canonical in workflow authoring guidance. Optional maintainer context.",
68
- "authoritative": false
69
- },
70
- {
71
- "id": "authoring-guide-v2",
72
- "title": "Workflow Authoring Guide (v2)",
73
- "source": "docs/authoring-v2.md",
74
- "resolveFrom": "package",
75
- "purpose": "Current v2 authoring principles, references guidance, and durable execution patterns.",
76
- "authoritative": true
77
- },
78
- {
79
- "id": "workflow-authoring-reference",
80
- "title": "Workflow Authoring Reference",
81
- "source": "docs/design/workflow-authoring-v2.md",
82
- "resolveFrom": "package",
83
- "purpose": "Detailed v2 workflow authoring patterns for loops, conditions, references, and workflow structure.",
84
- "authoritative": true
85
- },
86
- {
87
- "id": "routines-guide",
88
- "title": "Routines Guide",
89
- "source": "docs/design/routines-guide.md",
90
- "resolveFrom": "package",
91
- "purpose": "Current guide for deciding when to use delegation, direct execution, or template injection in authored workflows.",
92
- "authoritative": false
93
- },
94
- {
95
- "id": "lean-coding-workflow",
96
- "title": "Lean Coding Workflow (Modern Authoring Baseline)",
97
- "source": "workflows/coding-task-workflow-agentic.lean.v2.json",
98
- "resolveFrom": "package",
99
- "purpose": "Strong modern example for engine-native authoring patterns, loop semantics, prompt density, and bounded delegation.",
100
- "authoritative": false
101
- },
102
- {
103
- "id": "mr-review-workflow",
104
- "title": "MR Review Workflow (Outcome Baseline Example)",
105
- "source": "workflows/mr-review-workflow.agentic.v2.json",
106
- "resolveFrom": "package",
107
- "purpose": "Strong example of hypothesis, neutral fact packet, reviewer families, contradiction synthesis, final validation, and a three-dimension orthogonal assessment gate.",
108
- "authoritative": false
109
- },
110
- {
111
- "id": "bug-investigation-workflow",
112
- "title": "Bug Investigation Workflow (Assessment Gate Example)",
113
- "source": "workflows/bug-investigation.agentic.v2.json",
114
- "resolveFrom": "package",
115
- "purpose": "Simpler example of a single-dimension assessment gate on a final validation step before handoff.",
116
- "authoritative": false
117
- },
118
- {
119
- "id": "readiness-audit-workflow",
120
- "title": "Production Readiness Audit (Audit Baseline Example)",
121
- "source": "workflows/production-readiness-audit.json",
122
- "resolveFrom": "package",
123
- "purpose": "Example of a thorough evidence-driven audit workflow with explicit reviewer-family structure and confidence handling.",
124
- "authoritative": false
125
- }
126
- ],
127
- "steps": [
128
- {
129
- "id": "phase-0-understand-and-classify",
130
- "title": "Phase 0: Understand and Classify the Authoring Task",
131
- "promptBlocks": {
132
- "goal": "Understand what workflow you are authoring or modernizing, and classify the task before you design anything.",
133
- "constraints": [
134
- [
135
- {
136
- "kind": "ref",
137
- "refId": "wr.refs.notes_first_durability"
138
- }
139
- ],
140
- "Explore first. Ask the user only what you genuinely cannot determine with tools and references.",
141
- "Choose baselines as models, not templates. Copy structural patterns, not another workflow's domain voice."
142
- ],
143
- "procedure": [
144
- "Read the schema, authoring spec, v2 authoring guides, and the strongest relevant example workflows.",
145
- "Classify the target workflow archetype: `review_audit`, `coding_execution`, `diagnostic_investigation`, `planning_design`, `linear_operational`, or `content_analysis`.",
146
- "Classify `workflowComplexity`: Simple, Medium, or Complex.",
147
- "Choose an `authoringBaseline` for engine-native authoring quality and an `outcomeBaseline` for the kind of job the authored workflow should perform. If no good baseline exists for one of them, set it to `none` and explain why."
148
- ],
149
- "outputRequired": {
150
- "notesMarkdown": "Task understanding, baseline choices, patterns to borrow or avoid, and any real open questions.",
151
- "context": "Capture authoringMode, workflowArchetype, workflowComplexity, taskDescription, intendedAudience, successCriteria, domainConstraints, targetWorkflowPath, modernizationGoals, authoringBaseline, outcomeBaseline, baselineDecisionRationale, authoringPatternsToBorrow, outcomePatternsToBorrow, patternsToAvoid, openQuestions, and valueInventory (modernize_existing only)."
152
- },
153
- "verify": [
154
- "The task is understood well enough to design the workflow without guessing blindly.",
155
- "Both authoring and outcome baselines are explicit, or their absence is justified."
156
- ]
157
- },
158
- "requireConfirmation": true,
159
- "promptFragments": [
160
- {
161
- "id": "phase-0-modernize-inventory",
162
- "when": {
163
- "var": "authoringMode",
164
- "equals": "modernize_existing"
165
- },
166
- "text": "Decide `authoringMode`: `create` or `modernize_existing`."
167
- },
168
- {
169
- "id": "phase-0-modernize-scope",
170
- "when": {
171
- "var": "authoringMode",
172
- "equals": "modernize_existing"
173
- },
174
- "text": "If `authoringMode = modernize_existing`, build a value inventory BEFORE forming opinions about what to change. Read the original and classify each meaningful mechanism: (1) enforcement mechanisms (forcing functions, hard gates, required outputs), (2) domain knowledge (problem-specific principles the agent would not otherwise know), (3) behavioral rules (persistent constraints on how the agent works). This inventory is the preservation checklist."
175
- }
176
- ]
177
- },
178
- {
179
- "id": "phase-1-define-effectiveness-target",
180
- "title": "Phase 1: Define the Effectiveness Target",
181
- "promptBlocks": {
182
- "goal": "Define what success should feel like for the authored workflow, not just what fields it should contain.",
183
- "constraints": [
184
- "Be specific about user satisfaction and dangerous false-confidence outcomes.",
185
- "Distinguish a technically valid workflow from a satisfying one."
186
- ],
187
- "procedure": [
188
- "State what result the authored workflow should reliably produce for its user.",
189
- "List the criteria that would make the workflow feel genuinely satisfying in practice.",
190
- "Name the biggest likely failure mode and the most dangerous false-confidence mode.",
191
- "State what would make the workflow technically correct but still disappointing."
192
- ],
193
- "outputRequired": {
194
- "notesMarkdown": "Effectiveness target, satisfaction criteria, failure modes, and false-confidence risks.",
195
- "context": "Capture effectivenessTarget, userSatisfactionCriteria, primaryFailureMode, dangerousFalseConfidenceModes, likelyWeakOutcomeModes, and trustRisk."
196
- },
197
- "verify": [
198
- "The authored workflow now has a clear outcome bar, not just an authoring bar."
199
- ]
200
- },
201
- "requireConfirmation": false
202
- },
203
- {
204
- "id": "phase-2-design-workflow-architecture",
205
- "title": "Phase 2: Design the Workflow Architecture",
206
- "promptBlocks": {
207
- "goal": "Decide the workflow architecture before you write JSON.",
208
- "constraints": [
209
- "Separate workflow architecture from quality-gate architecture. This phase is about the authored workflow itself.",
210
- "Keep delegation bounded and keep ownership with the main agent."
211
- ],
212
- "procedure": [
213
- "Decide the phase list, one-line goal for each phase, and overall ordering.",
214
- "Design loops with explicit exit rules, bounded maxIterations, and real reasons for another pass.",
215
- "Decide confirmation gates, delegation vs template injection vs direct execution, promptFragments, references, artifacts, and metaGuidance."
216
- ],
217
- "outputRequired": {
218
- "notesMarkdown": "Structured workflow outline, loop design, confirmation design, delegation design, artifact plan, and modernization mapping.",
219
- "context": "Capture workflowOutline, loopDesign, confirmationDesign, delegationDesign, artifactPlan, contextModel, voiceStrategy, routineAudit, delegationBoundaries, templateInjectionPlan, modernizationStrategy, legacyMapping, behaviorPreservationNotes, and valuePreservationMap (modernize_existing only)."
220
- },
221
- "verify": [
222
- "The authored workflow architecture is coherent before JSON drafting begins."
223
- ]
224
- },
225
- "promptFragments": [
226
- {
227
- "id": "phase-2-simple-scope",
228
- "when": {
229
- "var": "workflowComplexity",
230
- "equals": "Simple"
231
- },
232
- "text": "This is a Simple workflow. Keep the architecture lightweight: a flat phase list, no loops required, minimal confirmations, no delegation. The goal is clarity and direct execution, not structural sophistication."
233
- },
234
- {
235
- "id": "phase-2-simple-direct",
236
- "when": {
237
- "var": "workflowComplexity",
238
- "equals": "Simple"
239
- },
240
- "text": "For Simple workflows, keep the architecture linear and compact. Do not invent loops or ceremony unless the task truly needs them."
241
- },
242
- {
243
- "id": "phase-2-modernize-mapping",
244
- "when": {
245
- "var": "authoringMode",
246
- "equals": "modernize_existing"
247
- },
248
- "text": "If `authoringMode = modernize_existing`, decide preserve-in-place, restructure, or rewrite. For each item in valueInventory, record: `preserved` (structurally present with equivalent enforcement), `replaced` (new mechanism prevents same failure mode -- justify equivalence), or `dropped` (intentionally removed -- justify the loss). Phase-level mapping alone is insufficient; track what was inside each restructured or removed phase."
249
- }
250
- ],
251
- "requireConfirmation": {
252
- "var": "workflowComplexity",
253
- "not_equals": "Simple"
254
- }
255
- },
256
- {
257
- "id": "phase-3-design-quality-architecture",
258
- "title": "Phase 3: Design the Quality-Gate Architecture",
259
- "promptBlocks": {
260
- "goal": "Design how the authored workflow will avoid shallow results, false confidence, and state bloat.",
261
- "constraints": [
262
- "This phase is about the authored workflow's quality model, not its basic phase list.",
263
- "Prefer explicit quality structure over hoping the agent will infer it."
264
- ],
265
- "procedure": [
266
- "Decide whether the authored workflow needs a hypothesis step, neutral fact packet, reviewer or validator families, contradiction loop, final validation bundle, or explicit blind-spot handling.",
267
- "Design the confidence model, blind-spot model, and state economy plan.",
268
- "Decide the hard-gate dimensions that would make the authored workflow unsafe or unsatisfying if they fail. If hard gates exist, implement them using the native `assessments` + `assessmentRefs` + `assessmentConsequences` schema fields rather than informal notes or `requireConfirmation` alone. Each dimension should capture a distinct orthogonal failure mode -- not restate the workflow's existing confidence band. See `mr-review-workflow.agentic.v2.json` and `bug-investigation.agentic.v2.json` as exemplars.",
269
- "Write the redesign triggers that should force architectural revision rather than cosmetic refinement."
270
- ],
271
- "outputRequired": {
272
- "notesMarkdown": "Quality architecture, confidence model, blind-spot model, state economy plan, and hard-gate triggers.",
273
- "context": "Capture qualityArchitecture, confidenceModel, blindSpotModel, stateEconomyPlan, reviewBundlePlan, qualityGateTriggers, and hardGateModel."
274
- },
275
- "verify": [
276
- "The authored workflow has an explicit plan for false-confidence resistance and quality review."
277
- ]
278
- },
279
- "promptFragments": [
280
- {
281
- "id": "phase-3-simple-scope",
282
- "when": {
283
- "var": "workflowComplexity",
284
- "equals": "Simple"
285
- },
286
- "text": "This is a Simple workflow. Quality architecture should be proportional: identify one primary false-confidence mode and one key hard gate. Skip full reviewer family design -- a single self-executed review pass is sufficient."
287
- }
288
- ]
289
- },
290
- {
291
- "id": "phase-4-draft-or-revise",
292
- "title": "Phase 4: Draft or Revise the Workflow",
293
- "promptBlocks": {
294
- "goal": "Write the workflow JSON file using the architecture and quality model you already chose.",
295
- "constraints": [
296
- "The schema defines what is legal. The authoring spec defines what is good.",
297
- "Write prompts in the user's voice. Vary prompt density by step needs rather than using one density everywhere.",
298
- "If you are modernizing, preserve what still fits the workflow's purpose. Do not rewrite just because a workflow is old."
299
- ],
300
- "procedure": [
301
- "If `authoringMode = create` and no filename was specified, ask the user for the filename before writing.",
302
- "If `authoringMode = modernize_existing`, default to editing `targetWorkflowPath` unless there is a strong reason to create a new variant or file.",
303
- "Write the workflow file. Keep protocol requirements explicit, loops bounded, confirmations meaningful, and metaGuidance clean."
304
- ],
305
- "outputRequired": {
306
- "notesMarkdown": "Draft status and any notable authoring choices that are important to later review.",
307
- "context": "Capture workflowFilePath and draftComplete."
308
- },
309
- "verify": [
310
- "The workflow file exists and reflects the chosen architecture rather than an improvised one."
311
- ]
312
- },
313
- "promptFragments": [
314
- {
315
- "id": "phase-4-simple-fast",
316
- "when": {
317
- "var": "workflowComplexity",
318
- "equals": "Simple"
319
- },
320
- "text": "For Simple workflows, keep the file compact and linear. Do not create extra metaGuidance or loops unless the task truly needs them."
321
- }
322
- ],
323
- "requireConfirmation": false
324
- },
325
- {
326
- "id": "phase-5-validate",
327
- "type": "loop",
328
- "title": "Phase 5: Structural Validation Loop",
329
- "loop": {
330
- "type": "while",
331
- "conditionSource": {
332
- "kind": "artifact_contract",
333
- "contractRef": "wr.contracts.loop_control",
334
- "loopId": "validation_loop"
335
- },
336
- "maxIterations": 3
337
- },
338
- "body": [
339
- {
340
- "id": "phase-5a-run-validation",
341
- "title": "Run Validation",
342
- "promptBlocks": {
343
- "goal": "Run the real workflow validators against the drafted workflow.",
344
- "constraints": [
345
- "Do not rely on reading the JSON and eyeballing it.",
346
- "If runtime and authoring assumptions conflict, runtime wins."
347
- ],
348
- "procedure": [
349
- "Run the available validation tools or commands such as `npm run validate:registry`, schema validation, or the MCP validation surface.",
350
- "If validation fails, list the actual errors, fix them in the workflow file, and re-run validation.",
351
- "If validation passes cleanly, say so plainly."
352
- ],
353
- "outputRequired": {
354
- "notesMarkdown": "Validation results, actual errors if any, and what was fixed.",
355
- "context": "Capture validationErrors and validationPassed."
356
- },
357
- "verify": [
358
- "Validation results are based on real validators, not approximations."
359
- ]
360
- },
361
- "promptFragments": [
362
- {
363
- "id": "phase-5a-thorough",
364
- "text": "After structural validation passes, also check the workflow manually against required-level authoring-spec rules and fix any failures before moving on."
365
- }
366
- ],
367
- "requireConfirmation": false
368
- },
369
- {
370
- "id": "phase-5b-loop-decision",
371
- "title": "Validation Loop Decision",
372
- "promptBlocks": {
373
- "goal": "Decide whether structural validation needs another pass.",
374
- "constraints": [
375
- "Use validator state, not vibes."
376
- ],
377
- "procedure": [
378
- "If all errors are fixed and validation passes, stop.",
379
- "If you fixed errors but have not re-validated yet, continue.",
380
- "If you hit the iteration limit, stop and record what remains."
381
- ],
382
- "outputRequired": {
383
- "artifact": "Emit a `wr.loop_control` artifact for `validation_loop` with `decision` set to `continue` or `stop`."
384
- },
385
- "verify": [
386
- "The loop decision matches the actual validation state."
387
- ]
388
- },
389
- "outputContract": {
390
- "contractRef": "wr.contracts.loop_control"
391
- },
392
- "requireConfirmation": false
393
- }
394
- ]
395
- },
396
- {
397
- "id": "phase-5-escalation",
398
- "title": "Validation Escalation",
399
- "runCondition": {
400
- "var": "validationPassed",
401
- "equals": false
402
- },
403
- "promptBlocks": {
404
- "goal": "Stop execution. A structurally broken workflow must not proceed to the quality gate loop. Surface the errors and require user intervention.",
405
- "constraints": [
406
- "Present the situation honestly."
407
- ],
408
- "procedure": [
409
- "List the remaining validation errors and assess their severity.",
410
- "Stop and require user intervention. Do not proceed into the quality gate loop with a structurally broken workflow. The user must explicitly decide how to resolve each remaining error before this workflow can continue."
411
- ]
412
- },
413
- "requireConfirmation": true
414
- },
415
- {
416
- "id": "phase-6-quality-gate-loop",
417
- "type": "loop",
418
- "title": "Phase 6: Quality-Gate Loop",
419
- "loop": {
420
- "type": "while",
421
- "conditionSource": {
422
- "kind": "artifact_contract",
423
- "contractRef": "wr.contracts.loop_control",
424
- "loopId": "quality_gate_loop"
425
- },
426
- "maxIterations": 3
427
- },
428
- "body": [
429
- {
430
- "id": "phase-6a-state-economy-audit",
431
- "title": "State Economy Audit",
432
- "promptBlocks": {
433
- "goal": "Check whether every context field in the authored workflow earns its keep.",
434
- "constraints": [
435
- "A field is justified only if it materially affects routing, synthesis, confidence, or handoff quality.",
436
- "Do not keep bookkeeping fields just because they sound organized."
437
- ],
438
- "procedure": [
439
- "For each meaningful captured context field, record where it is set, where it is consumed, what decision or outcome it influences, and what gets worse if it is removed.",
440
- "Classify each field as `keep`, `wire`, or `remove`.",
441
- "Fix weak or unused fields directly in the workflow file."
442
- ],
443
- "outputRequired": {
444
- "notesMarkdown": "State field audit with keep/wire/remove decisions and any fixes applied.",
445
- "context": "Capture stateFieldAudit, unusedOrWeakFields, and stateEconomyPassed."
446
- },
447
- "verify": [
448
- "Weak or unused fields are either wired meaningfully or removed."
449
- ]
450
- },
451
- "requireConfirmation": false,
452
- "assessmentRefs": [
453
- "state-economy-gate"
454
- ],
455
- "assessmentConsequences": [
456
- {
457
- "when": {
458
- "anyEqualsLevel": "low"
459
- },
460
- "effect": {
461
- "kind": "require_followup",
462
- "guidance": "state_economy low -- one or more context fields are unused, weakly consumed, or carry no decision weight. Remove or wire them before proceeding: trace a concrete downstream use for each field, or delete it from the workflow."
463
- }
464
- }
465
- ]
466
- },
467
- {
468
- "id": "phase-6b-execution-simulation",
469
- "title": "Execution Simulation",
470
- "promptBlocks": {
471
- "goal": "Simulate what would happen if the authored workflow ran on the user's real task.",
472
- "constraints": [
473
- "This is about practical utility, not only context-flow correctness.",
474
- "Flag places where the workflow would produce paperwork, generic output, or false confidence instead of value."
475
- ],
476
- "procedure": [
477
- "Trace the authored workflow step by step against the user's actual task or the closest realistic scenario.",
478
- "For each step, ask: what would the agent actually do, what context would it have, what would it likely produce, and what would the next step inherit?",
479
- "Also trace at least one degraded or edge-case path -- not just the happy path. Ask: what happens when a condition evaluates unexpectedly, a loop has nothing to iterate, a runCondition skips a phase, or the user provides minimal input? Quality gates that only protect the happy path are not quality gates.",
480
- "Identify likely weak steps, likely unsatisfying outputs, and likely false-confidence modes.",
481
- "For any loop in the workflow, explicitly check: does the exit condition have structural teeth (artifact contract, bounded maxIterations), or does it rely on prose instructions the engine cannot enforce?",
482
- "Fix issues directly in the workflow file when the right improvement is clear."
483
- ],
484
- "outputRequired": {
485
- "notesMarkdown": "Execution simulation findings, likely weak steps, unsatisfying outputs, false-confidence risks, and any fixes applied.",
486
- "context": "Capture simulationFindings, likelyWeakSteps, likelyUnsatisfyingOutputs, falseConfidenceFindings, and outcomeEffectivenessPassed."
487
- },
488
- "verify": [
489
- "The simulation judges likely usefulness, not just structural legality."
490
- ]
491
- },
492
- "promptFragments": [
493
- {
494
- "id": "phase-6b-modernize-check",
495
- "when": {
496
- "var": "authoringMode",
497
- "equals": "modernize_existing"
498
- },
499
- "text": "For modernize_existing: after tracing the workflow forward, check each item in valueInventory. For each enforcement mechanism and domain knowledge item: would the modernized workflow produce the same behavior? Any item where the answer is no or weaker is a loss -- fix it directly or record the accepted tradeoff with justification."
500
- }
501
- ],
502
- "requireConfirmation": false,
503
- "assessmentRefs": [
504
- "simulation-outcome-gate"
505
- ],
506
- "assessmentConsequences": [
507
- {
508
- "when": {
509
- "anyEqualsLevel": "low"
510
- },
511
- "effect": {
512
- "kind": "require_followup",
513
- "guidance": "simulation_outcome low -- the simulation found likely unsatisfying or false-confidence outputs that were not fixed inline. Address the identified weak steps or degraded-path failures before the adversarial review."
514
- }
515
- }
516
- ]
517
- },
518
- {
519
- "id": "phase-6c-adversarial-quality-review",
520
- "title": "Adversarial Quality Review",
521
- "promptBlocks": {
522
- "goal": "Review the authored workflow as a quality gate, not just as a valid JSON file.",
523
- "constraints": [
524
- "Authoring integrity and outcome effectiveness are separate concerns. Score both.",
525
- "Reviewer-family or validator output is evidence, not authority."
526
- ],
527
- "procedure": [
528
-         "Score these dimensions 0-2 with one sentence of evidence each: `voiceClarity`, `ceremonyLevel`, `loopSoundness`, `delegationBoundedness`, `artifactClarity`, `taskEffectiveness`, `falseConfidenceResistance`, `stateMinimality`, `coverageSharpness`, `domainFit`, `handoffUtility`, `complexityScaling` (0 = has appropriate fast paths or scope-sensitive branching for simpler inputs; 2 = single-weight, over-engineers simple cases), `enforcementStrength` (0 = behavioral rules have structural teeth; 2 = important rules are prose-only with no enforcement mechanism), and `modernizationDiscipline` (0 = every valueInventory item preserved, equivalently replaced with justification, or dropped with justification; 2 = items missing or replaced with weaker versions without justification -- score 0 for create mode).",
529
- "Run an adversarial review bundle with these lenses: `engine_native_reviewer`, `task_effectiveness_reviewer`, `state_economy_reviewer`, `false_confidence_reviewer`, `domain_fit_reviewer`, and `maintainer_reviewer`.",
530
- "Synthesize what the review confirmed, what it challenged, and what changed your mind.",
531
- "When scoring `falseConfidenceResistance`, explicitly check: do the workflow's quality gates protect edge cases and degraded paths, or only the happy path? A workflow that passes its own checks on ideal input but fails silently on minimal or unexpected input scores 2.",
532
-         "Set hard-gate failures whenever any of these are materially weak: `taskEffectiveness`, `falseConfidenceResistance`, `stateMinimality`, `coverageSharpness`, `domainFit`, `handoffUtility`, or `complexityScaling` (0 = has appropriate fast paths or scope-sensitive branching for simpler inputs; 2 = single-weight, over-engineers simple cases).",
533
- "Set `authoringIntegrityPassed = true` only if structural and authoring-quality dimensions are all acceptable. Set `outcomeEffectivenessPassed = true` only if the workflow is likely to achieve satisfying results for the user."
- ],
- "outputRequired": {
- "notesMarkdown": "Quality review scores, adversarial review findings, hard-gate failures, and the current redesign severity.",
- "context": "Capture reviewScores, hardGateFailures, authoringIntegrityPassed, outcomeEffectivenessPassed, qualityReviewSummary, and redesignSeverity."
- },
- "verify": [
- "Hard gates reflect real user-trust risk, not cosmetic imperfections."
- ]
- },
- "promptFragments": [
- {
- "id": "phase-6c-rigor",
- "text": "Always use adversarial reviewer lanes. Assume the first review is not enough -- use the full adversarial bundle unless a hard limitation makes it impossible."
- },
- {
- "id": "phase-6c-heritage-review",
- "when": {
- "var": "authoringMode",
- "equals": "modernize_existing"
- },
- "text": "For modernize_existing: add a heritage_reviewer to the adversarial bundle. Its job is to check each valueInventory item and find what was lost or weakened -- it ignores format improvements. It must answer: which enforcement mechanisms are now prose-only? Which domain knowledge items are absent? Which behavioral rules were removed without equivalent replacement? Heritage_reviewer findings drive enforcementStrength and modernizationDiscipline scores."
- }
- ],
- "requireConfirmation": false,
- "validationCriteria": [
- {
- "type": "contains",
- "value": "complexityScaling",
- "message": "Review must score complexityScaling"
- }
- ],
- "assessmentRefs": [
- "authoring-integrity-gate",
- "outcome-effectiveness-gate"
- ],
- "assessmentConsequences": [
- {
- "when": {
- "anyEqualsLevel": "low"
- },
- "effect": {
- "kind": "require_followup",
- "guidance": "authoring_integrity low -- structural or quality dimensions are unacceptable; fix voice, ceremony, loop, delegation, or state problems before this workflow can be trusted. outcome_effectiveness low -- task effectiveness, false-confidence resistance, coverage sharpness, or domain fit are materially weak; redesign, do not patch."
- }
- }
- ]
- },
- {
- "id": "phase-6d-redesign-and-revalidate",
- "title": "Redesign and Revalidate",
- "promptBlocks": {
- "goal": "If hard gates fail, redesign the workflow instead of polishing around the problem.",
- "constraints": [
- "Minor cosmetic refinement is not enough when task effectiveness or false-confidence resistance is weak.",
- "If structure changes, re-run real validators before leaving this step."
- ],
- "procedure": [
- "Classify the needed redesign severity as `minor`, `architectural`, or `unsafe_to_ship` and apply the necessary fixes directly to the workflow file.",
- "If the redesign changed structure, run the real validators again and update the validation state before leaving this step."
- ],
- "outputRequired": {
- "notesMarkdown": "Redesign actions taken, why they were needed, and whether revalidation passed.",
- "context": "Capture redesignApplied, validationPassed, and remainingConcerns."
- },
- "verify": [
- "Structural redesign problems are handled as redesign problems, not cosmetic ones."
- ]
- },
- "requireConfirmation": false,
- "runCondition": {
- "or": [
- {
- "var": "authoringIntegrityPassed",
- "equals": false
- },
- {
- "var": "outcomeEffectivenessPassed",
- "equals": false
- }
- ]
- }
- },
- {
- "id": "phase-6e-quality-loop-decision",
- "title": "Quality Loop Decision",
- "promptBlocks": {
- "goal": "Decide whether the quality-gate loop needs another pass.",
- "constraints": [
- "Use hard gates and actual remaining concerns, not vibes."
- ],
- "procedure": [
- "Continue if `authoringIntegrityPassed = false`.",
- "Otherwise continue if `outcomeEffectivenessPassed = false`.",
- "Otherwise continue if `hardGateFailures` is not empty.",
- "Otherwise continue if `redesignSeverity` is `architectural` or `unsafe_to_ship` and you have not yet re-reviewed the redesigned workflow.",
- "Otherwise continue if `validationPassed = false` after redesign.",
- "Otherwise stop."
- ],
- "outputRequired": {
- "artifact": "Emit a `wr.loop_control` artifact for `quality_gate_loop` with `decision` set to `continue` or `stop`."
- },
- "verify": [
- "The workflow does not stop while hard trust problems remain."
- ]
- },
- "outputContract": {
- "contractRef": "wr.contracts.loop_control"
- },
- "requireConfirmation": false
- }
- ]
- },
- {
- "id": "phase-7a-assign-tags",
- "title": "Phase 7a: Assign Tags",
- "promptBlocks": {
- "goal": "Register the workflow in the catalog: assign tags in spec/workflow-tags.json and write the 'about' and 'examples' fields into the workflow JSON so humans and agents can discover and understand the workflow.",
- "procedure": [
- "Read spec/workflow-tags.json to see the available tags and their 'when' phrases.",
- "Based on the workflow's purpose and description, select 1-3 tags from the closed set (coding, review_audit, investigation, design, documentation, tickets, learning, routines, authoring).",
- "Check whether the workflow ID already exists in the `workflows` section. If it does, update the existing entry's tags rather than adding a duplicate. If it does not exist, add a new entry under 'workflows' in spec/workflow-tags.json: { \"tags\": [\"<tag1>\"] }.",
- "If the workflow is a test fixture or internal utility not meant for end-user discovery, add 'hidden': true.",
- "Save the tags file. Do not modify any other field.",
- "Write the 'about' field into the workflow JSON: a markdown string (100-400 words) written for a human deciding whether to use this workflow. Cover what it does, when to use it, what it produces, and how to get good results. This is a user-facing surface -- not agent instructions (use metaGuidance for that).",
- "Write the 'examples' field into the workflow JSON: an array of 2-4 short, concrete goal strings (10-120 chars each) showing what this workflow is used for. Each example should be specific enough to be informative -- not generic ('implement a feature'). These appear in list_workflows output so agents can communicate concrete goal phrasing to users.",
- "Skip 'about' and 'examples' only if the workflow is marked hidden: true."
- ],
- "constraints": [
- "Only use tags from the closed set. Do not invent new tags.",
- "If the workflow already has an entry in the tags file, update it rather than adding a duplicate.",
- "Tags should reflect what the workflow does, not what it is named.",
- "Write 'about' for humans, not agents -- do not copy metaGuidance or step prompt text into it.",
- "Examples must be specific to this workflow; reject generic examples that would fit any workflow."
- ],
- "outputRequired": {
- "notesMarkdown": "List the assigned tags with a one-line justification for each. Confirm that 'about' and 'examples' were written."
- }
- },
- "requireConfirmation": false
- },
- {
- "id": "phase-7b-declare-metrics-profile",
- "title": "Phase 7b: Declare metricsProfile",
- "promptBlocks": {
- "goal": "Declare the metricsProfile field in the authored workflow JSON, or explicitly justify omitting it. The metricsProfile field enables engine-injected metrics instrumentation footers in step prompts. When set, the engine injects context key accumulation reminders into every step prompt -- guiding compliant agents to report outcome, commit SHAs, PR numbers, and diff stats at session completion. Without this field, captureConfidence is always 'none' and no session metrics are collected.",
- "procedure": [
- "Choose the correct profile based on what the workflow produces:",
- " - 'coding': produces code commits. Use for implementation, refactoring, bug-fix, migration, and documentation-writing workflows.",
- " - 'review': produces a review decision on a PR or MR. Use for code review, audit, and change validation workflows.",
- " - 'research': produces a finding or recommendation but no commits. Use for investigation, diagnosis, and analysis workflows.",
- " - 'design': produces a design artifact (pitch, spec, ADR, architecture doc) but no commits.",
- " - 'ticket': creates or updates work items in an external system (Jira, GitHub Issues, Linear).",
- " - 'none': meta-workflow, authoring tool, utility routine, or no measurable outcome. Set explicitly and document the reason.",
- "Add `\"metricsProfile\": \"<profile>\"` as a top-level field in the workflow JSON, after `recommendedPreferences` if that field exists.",
- "If choosing 'none', record the justification in your notes so the decision is auditable."
- ],
- "constraints": [
- "Do not invent a new profile value. The closed set is: 'coding', 'review', 'research', 'design', 'ticket', 'none'.",
- "The engine does NOT derive the profile from tags automatically. You must set it explicitly.",
- "workflow-for-workflows itself produces a workflow JSON artifact (a design output) -- use 'design'. It does not commit code."
- ],
- "outputRequired": {
- "notesMarkdown": "State the chosen metricsProfile and a one-line justification. If omitting, explain why."
- }
- },
- "requireConfirmation": false
- },
- {
- "id": "phase-7-final-trust-handoff",
- "title": "Phase 7: Final Trust Handoff",
- "promptBlocks": {
- "goal": "Summarize the authored or modernized workflow as a trust decision, not just a file edit.",
- "constraints": [
- "Keep it concise. The workflow file is the deliverable, not the summary."
- ],
- "procedure": [
- "Stamp the workflow file: read the current `version` from `spec/authoring-spec.json` and write `validatedAgainstSpecVersion: <N>` as a top-level field in the workflow JSON. Commit the change -- the stamp has no effect if only saved locally.",
- "State the workflow file path and name, whether it was created or modernized, and what it does in one sentence.",
- "Summarize the step structure, loops, confirmations, and delegation profile.",
- "Report validation status, authoring-integrity status, and outcome-effectiveness status.",
- "Set a final `workflowReadinessVerdict`: `ready`, `ready_with_conditions`, or `not_ready`.",
- "List the main improvements, residual weaknesses, trust risks if any, and how to test the workflow."
- ],
- "outputRequired": {
- "notesMarkdown": "Final trust handoff covering readiness verdict, validation status, strengths, residual weaknesses, and testing guidance.",
- "context": "Capture workflowReadinessVerdict, trustRiskSummary, knownFailureModes, and residualWeaknesses."
- },
- "verify": [
- "The final handoff makes clear whether WorkRail should trust this workflow."
- ]
- },
- "notesOptional": true,
- "requireConfirmation": false
- }
- ],
- "validatedAgainstSpecVersion": 3,
- "assessments": [
- {
- "id": "state-economy-gate",
- "purpose": "Every context field in the authored workflow earns its keep: it is set, consumed, and influences a concrete decision or output.",
- "dimensions": [
- {
- "id": "state_economy",
- "purpose": "All context fields have a traceable downstream use. No field is captured speculatively or carried without purpose.",
- "levels": [
- "low",
- "high"
- ]
- }
- ]
- },
- {
- "id": "simulation-outcome-gate",
- "purpose": "A concrete execution simulation has been completed and identified weak steps, likely unsatisfying outputs, and false-confidence modes -- and any found were fixed inline.",
- "dimensions": [
- {
- "id": "simulation_outcome",
- "purpose": "Simulation completed over at least one happy path and one degraded/edge-case path. Issues found were fixed or explicitly accepted.",
- "levels": [
- "low",
- "high"
- ]
- }
- ]
- },
- {
- "id": "authoring-integrity-gate",
- "purpose": "The authored workflow is structurally sound and meets authoring quality standards: voice, ceremony, loops, delegation, and state are all acceptable.",
- "dimensions": [
- {
- "id": "authoring_integrity",
- "purpose": "Structural and authoring-quality dimensions passed the adversarial review. No hard gate failure on voice, ceremony, loopSoundness, delegationBoundedness, or stateMinimality.",
- "levels": [
- "low",
- "high"
- ]
- }
- ]
- },
- {
- "id": "outcome-effectiveness-gate",
- "purpose": "The authored workflow is likely to produce genuinely satisfying results: task effectiveness, false-confidence resistance, coverage sharpness, domain fit, and handoff utility are all acceptable.",
- "dimensions": [
- {
- "id": "outcome_effectiveness",
- "purpose": "The workflow passes adversarial review on taskEffectiveness, falseConfidenceResistance, coverageSharpness, domainFit, and handoffUtility. No hard gate failure on any outcome dimension.",
- "levels": [
- "low",
- "high"
- ]
- }
- ]
- }
- ]
- }
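
The `phase-6e-quality-loop-decision` procedure in the removed file is an ordered chain of "continue if" checks ending in "otherwise stop". A minimal Python sketch of that ordering follows, assuming the captured context is available as a plain dict. The function name `decide_quality_loop`, the `re_reviewed_after_redesign` flag, and the `loopId` key in the returned dict are hypothetical illustrations, not WorkRail APIs; the real `wr.loop_control` artifact shape is not shown in this diff.

```python
def decide_quality_loop(context: dict) -> dict:
    """Return a loop decision following the phase-6e check order.

    Hypothetical helper: the context keys mirror names used in the
    workflow text, but the dict shapes here are assumptions.
    """
    keep_going = (
        # Either top-level gate explicitly failed.
        context.get("authoringIntegrityPassed") is False
        or context.get("outcomeEffectivenessPassed") is False
        # Any recorded hard-gate failure keeps the loop open.
        or bool(context.get("hardGateFailures"))
        # A serious redesign must be re-reviewed before stopping.
        or (
            context.get("redesignSeverity") in ("architectural", "unsafe_to_ship")
            and not context.get("re_reviewed_after_redesign", False)
        )
        # Revalidation failed after the redesign.
        or context.get("validationPassed") is False
    )
    return {
        "loopId": "quality_gate_loop",  # assumed key name
        "decision": "continue" if keep_going else "stop",
    }
```

In this sketch a missing flag counts as "not failed", so an empty context yields `stop`; a stricter variant could treat missing values as a reason to continue.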