workflow-supervisor 0.1.3 → 0.2.0
- package/CHANGELOG.md +139 -0
- package/README.md +125 -28
- package/bin/workflow-skills.mjs +201 -1
- package/docs/artifacts.md +9 -0
- package/docs/cli.md +3 -1
- package/docs/portable-delegation.md +19 -1
- package/docs/skill-reference.md +12 -2
- package/docs/troubleshooting.md +34 -0
- package/package.json +8 -2
- package/schemas/dossier-v1.schema.json +38 -0
- package/schemas/worker-report-v1.schema.json +120 -12
- package/skills/acceptance-matrix/SKILL.md +114 -2
- package/skills/acceptance-matrix/agents/openai.yaml +1 -1
- package/skills/dossier-builder/SKILL.md +28 -0
- package/skills/loop-policy/SKILL.md +29 -6
- package/skills/work-unit/SKILL.md +46 -6
- package/skills/workflow-docs/SKILL.md +2 -1
- package/skills/workflow-docs/references/workflow-control.md +93 -6
- package/skills/workflow-supervisor/SKILL.md +195 -46
- package/skills/workflow-supervisor/agents/openai.yaml +2 -2
@@ -56,6 +56,39 @@ Stale Artifacts Invalidated:
 ## Next Action
 ```

+## LEDGER.md
+
+Use this for `lean_work_unit_runner` when the backlog is already bounded and the workflow needs high throughput with human-verifiable state.
+
+```md
+# Lean Work Unit Ledger
+
+Profile: lean_work_unit_runner
+Execution Path:
+Mode:
+Delegation:
+Final Disposition:
+Batch Checkpoint:
+
+## Scope Contract
+
+Objective:
+Controlling Backlog Or Source:
+Allowed Surfaces:
+Forbidden Surfaces:
+Escalation Triggers:
+
+## Units
+
+| ID | Source Ref | Slice Type | Scope | Observable Behavior | Done Signal | Check | Status | Touched Surfaces | Evidence | Blocker Or Next Action |
+|---|---|---|---|---|---|---|---|---|---|---|
+
+## Batch Checkpoints
+
+| Batch | Units | Result | Checks | Human Review Needed | Next Action |
+|---|---|---|---|---|---|
+```
+
 ## SOURCE-CORPUS.md

 ```md
@@ -159,8 +192,20 @@ Notes:
 ```md
 # Work Units

-| ID | Worker Slug | Title |
-
+| ID | Worker Slug | Title | Slice Type | Observable Behavior | Expected Outcome | Demo Or Verification | Dependencies | Status | Verification |
+|---|---|---|---|---|---|---|---|---|---|
+
+## Unit Slice Details
+
+For each unit, record:
+
+id:
+slice_type: tracer_bullet | prefactor | migration | research | document | risk_boundary
+observable_behavior:
+expected_outcome:
+demo_or_verification:
+layers_touched:
+horizontal_slice_justification:

 ## Sequencing

@@ -206,6 +251,16 @@ Notes:

 ## Quality Or Risk Checks

+## Feedback Loop
+
+feedback_loop:
+command_or_evidence:
+red_capable: yes | no | not_applicable
+exact_symptom_or_behavior:
+deterministic: yes | no
+expected_runtime:
+agent_runnable: yes | no
+
 ## Required Checks Or Evidence

 ## Owner Or Contributor Role
@@ -226,14 +281,16 @@ Notes:
 ```md
 # Worker Map

-| Worker Name | Role | Transport | Work Unit | Dossier | Start Condition | Dependencies | Status | Terminal Report |
-
+| Worker Name | Role | Transport | Native Resource ID | Work Unit | Dossier | Start Condition | Dependencies | Status | Terminal Report | Close Action | Close Result |
+|---|---|---|---|---|---|---|---|---|---|---|---|

 ## Supervisor Checkpoints

 ## Blocked Workers

 ## Closed Workers
+
+Closed means the terminal report has been consumed and any native thread or subagent resource has a recorded close result. For Codex subagents, record the `spawn_agent` id as Native Resource ID and `close_agent` as Close Action. A workflow with open native workers must remain BLOCKED until the close result is recorded.
 ```

 ## ACCEPTANCE-MATRIX.md
@@ -241,12 +298,32 @@ Notes:
 ```md
 # Acceptance Matrix

-
-
+## Verification Environment
+
+| Capability | Available | Notes |
+|---|---|---|
+| shell | | |
+| filesystem | | |
+| git_diff | | |
+| browser | | |
+| playwright_mcp | | |
+| network | | |
+
+## Outcome Evaluation Matrix
+
+| ID | Source Requirement | Expected Outcome | Preferred Verification | Available Verification | Evidence Strength | Invalid PASS Conditions | Verdict | Evidence | Limitation |
+|---|---|---|---|---|---|---|---|---|---|
+
+## Acceptance Rows
+
+| ID | Requirement | Evidence Required | Verification Method | Feedback Loop | Evidence Classification | Adversarial Check | Status | Evidence |
+|---|---|---|---|---|---|---|---|---|

 ## Residual Risks

 ## Waivers
+
+## Verification Findings
 ```

 ## VERIFICATION-REPORT.md
@@ -269,6 +346,16 @@ Verified Worker:
 | Method | Result | Evidence |
 |---|---|---|

+## Verification Environment
+
+| Capability | Available | Notes |
+|---|---|---|
+
+## Outcome Evaluations
+
+| Row | Source Requirement | Expected Outcome | Verdict | Evidence Strength | Evidence | Limitation | Required External Check |
+|---|---|---|---|---|---|---|---|
+
 ## Acceptance Mapping

 | Requirement | Verdict | Evidence |
@@ -1,15 +1,99 @@
 ---
 name: workflow-supervisor
-description: Coordinate supervised
+description: Coordinate supervised workflows with profile-based overhead. Trigger whenever the user explicitly invokes workflow-supervisor, $workflow-supervisor, supervised workflow, lean work-unit runner, dossiers, work units, worker agents, handoffs, approval gates, durable resume, or workflow-state documentation. Route first before profile selection. If not explicitly invoked and the work is a small clear edit with obvious files and acceptance, do not invoke Workflow Supervisor. Execute directly. When explicitly invoked, first select the correct profile: lean_work_unit_runner for large already-bounded backlogs or low-footprint direct execution, strict_full_workflow for ambiguous/high-risk/source-of-truth/delegated work, or planning_only for intake and sequencing. Do not run strict ceremony just because the skill was named. When not explicitly invoked, use only for workflows with hard supervisor triggers such as multi-agent handoff, durable resume, high-risk verification, contradictory or missing sources, multi-unit scope, repair loops, approval gates, or workflow-state documentation.
 ---

 # Workflow Supervisor

-Use this skill as the coordinating spine for supervised
+Use this skill as the coordinating spine for supervised work. The supervisor owns decomposition, execution profile selection, loop discipline, stop gates, optional worker-agent handoff quality, and outcome reporting. It may do source discovery, implementation, focused verification, and reporting itself in lean mode. In strict mode, implementation, verification, repair-ticket writing, and documentation must be treated as separate worker-agent responsibilities when an automated worker path is available. Native threads, subagents, or the portable delegate command are transports for those worker agents.

-##
+## Route First

-
+Before profile selection, decide whether this is supervisor work at all.
+
+If Workflow Supervisor was not explicitly invoked and the task is a small, clear edit with obvious files and acceptance, do not invoke this skill. Execute directly with normal repository inspection and the relevant check.
+
+When Workflow Supervisor is explicitly invoked, do not silently skip it. Select the proportional profile and keep the overhead as small as that profile allows.
+
+| Situation | Route |
+|---|---|
+| Small, clear edit with obvious files and acceptance | Do not use Workflow Supervisor. Execute directly. |
+| Large bounded backlog with clear unit done signals | `lean_work_unit_runner`. |
+| Broad, ambiguous, source-of-truth, delegated, security-sensitive, dirty-state, release, resume, or externally published work | `strict_full_workflow`. |
+| Sequencing, risk review, or backlog shaping only | `planning_only`. |
+| Runnable uncertainty before implementation | Create a discovery or prototype unit first. |
+
+## Execution Profiles
+
+When the user explicitly invokes `workflow-supervisor`, `$workflow-supervisor`, or says to use this skill, first classify the workflow profile before creating heavy artifacts, goals, worker plans, dossiers, or subagents.
+
+Use `lean_work_unit_runner` when the source already contains bounded work units, tickets, issues, checklist rows, or backlog entries and the user's priority is throughput, low memory, direct execution, or many pure units. Lean mode is for "do the next unit" execution, not for interpreting a broad source of truth from scratch.
+
+Use `strict_full_workflow` when the task is ambiguous, high-risk, source-of-truth driven, regulated, delegated to multiple workers, externally published, security-sensitive, cross-system, or missing clear work-unit boundaries.
+
+Use `planning_only` when the user wants intake, backlog shaping, sequencing, risk review, or a plan without implementation.
+
+If the profile is unclear after reading the user's request and controlling source, ask one profile question and stop. Do not default to strict ceremony merely because the skill was named.
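
Reviewer sketch (not part of the package): the routing and profile rules in this hunk can be read as a small decision function. The `Request` fields and `select_profile` are hypothetical names, the boolean flags stand in for judgments the supervisor makes from the request and controlling source, and the ordering of the checks (planning first, then risk, then bounded backlog) is an assumption rather than something the skill text states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    """Hypothetical summary of a request; every field name is an assumption."""
    explicitly_invoked: bool
    hard_supervisor_trigger: bool = False   # e.g. multi-agent handoff, durable resume
    planning_without_implementation: bool = False
    ambiguous_or_high_risk: bool = False
    bounded_backlog: bool = False


def select_profile(req: Request) -> Optional[str]:
    """Return a profile name, "execute_directly", or None when one profile question is needed."""
    if not req.explicitly_invoked and not req.hard_supervisor_trigger:
        return "execute_directly"           # not supervisor work at all
    if req.planning_without_implementation:
        return "planning_only"
    if req.ambiguous_or_high_risk:
        return "strict_full_workflow"       # risk outranks a bounded backlog (assumed ordering)
    if req.bounded_backlog:
        return "lean_work_unit_runner"
    return None                             # unclear: ask one profile question and stop
```

Returning `None` rather than a default mirrors the rule that naming the skill must not, by itself, select strict ceremony.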
+
+### Lean Work Unit Runner
+
+Lean mode optimizes for large-unit throughput while preserving non-ambiguity and human-verifiable state. It keeps work units as the backbone and removes per-unit ceremony that does not directly improve execution.
+
+Lean mode requires exactly one upfront scope contract before the first unit starts:
+
+- objective and controlling backlog/source
+- selected profile: `lean_work_unit_runner`
+- execution path: `autonomous_goal` or `human_in_loop`
+- execution mode: usually `sequential`; parallel only for proven disjoint surfaces
+- delegation: default `same_session_phased`; workers/subagents only with explicit authorization for a specific batch or risk
+- final disposition and mutation boundaries
+- state medium: one compact ledger, usually inline or `.workflow/LEDGER.md`
+- batch size or checkpoint cadence for human review
+
+Lean mode requires a backlog where each executable unit has:
+
+```yaml
+id:
+source_ref:
+slice_type:
+scope:
+observable_behavior:
+done:
+check:
+status: pending | active | pass | fail | blocked | escalated
+notes:
+```
+
+Do not start a lean unit unless its boundary and done signal are clear. If a unit lacks `scope`, `done`, or `check`, mark it `blocked` and ask for the smallest missing decision, or split it into smaller units. Do not hide ambiguity in notes.
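
Reviewer sketch (not part of the package): the readiness rule above amounts to a gate over the unit fields shown in the YAML template. The helper name `gate_unit` and the choice to return the missing field names alongside the unit are assumptions made for illustration; the field names and the `blocked` status come from the template.

```python
REQUIRED_FIELDS = ("scope", "done", "check")


def gate_unit(unit: dict) -> tuple[dict, list[str]]:
    """Return a copy of the unit plus the required fields that are empty.

    A unit with any missing required field is marked `blocked`; the caller
    is expected to ask for the smallest missing decision or split the unit.
    """
    missing = [f for f in REQUIRED_FIELDS if not str(unit.get(f) or "").strip()]
    gated = dict(unit)  # do not mutate the ledger row in place
    if missing:
        gated["status"] = "blocked"
    return gated, missing
```

Keeping the missing fields as an explicit return value, rather than writing them into `notes`, follows the instruction not to hide ambiguity in notes.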
+
+For product or integration behavior, prefer tracer-bullet units that expose one observable behavior across the smallest useful set of layers. A product implementation unit must name `observable_behavior`, `expected_outcome`, and `demo_or_verification`, or explicitly use a non-product `slice_type` such as `prefactor`, `migration`, `research`, `document`, or `risk_boundary` with a `horizontal_slice_justification`.
+
+Lean per-unit loop:
+
+1. Select the next ready unit from the ledger.
+2. Inspect only the files, artifacts, or source slices needed for that unit.
+3. Apply the smallest implementation that satisfies the unit.
+4. Run the unit's targeted check, or record the exact reason the check is blocked.
+5. Update one ledger row with status, touched surfaces, check result, and residual risk if any.
+6. Continue to the next ready unit until the batch checkpoint, blocker, resource gate, or final disposition.
+
+Lean mode must not create per-unit SPEC files, full dossiers, worker maps, repair-ticket documents, or documenter passes by default. Use inline unit contracts and one compact ledger. Batch documentation and outcome reporting at checkpoints or final closeout.
+
+Escalate a lean unit to `strict_full_workflow` or pause for human review when:
+
+- source requirements conflict or are materially incomplete
+- the unit touches broad architecture, security, data loss, credentials, production systems, billing, legal/compliance, public API contracts, migrations, or destructive operations
+- the unit cannot name a targeted check or human-inspectable evidence
+- multiple units unexpectedly touch the same shared surface and need re-sequencing
+- repair repeats without new evidence
+- the user asks for independent verification, subagents, PR, deploy, publish, or external-service action
+- memory, process count, broad scans, or context churn threatens execution throughput
+
+Lean verification is proportional. Use `focused-check` for the unit or batch. Use `independent-verifier` only when a risk trigger or user instruction justifies the extra cost. A lean PASS requires the ledger to show every completed unit's source reference, done signal, check or substitute evidence, and touched surfaces.
+
+For bug fixes and risky behavior changes, the focused check must be red-capable or explicitly waived. A red-capable loop catches the exact symptom or behavior, not merely a related build, lint, or broad test. If no correct test surface exists, record an architecture or verification finding instead of a quiet skipped check.
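
Reviewer sketch (not part of the package): one way to read the red-capable rule is as a predicate over the `feedback_loop` fields from the work-unit template. The function name, the `change_kind` labels, and the `waived_by_user` flag are hypothetical; only `red_capable: yes | no | not_applicable` comes from the template.

```python
# Change kinds for which a red-capable loop (or an explicit waiver) is required.
# These labels are illustrative assumptions, not values defined by the skill.
RED_REQUIRED_KINDS = {"bug_fix", "risky_behavior_change"}


def focused_check_acceptable(change_kind: str, feedback_loop: dict,
                             waived_by_user: bool = False) -> bool:
    """Return True when the focused check satisfies the red-capable rule."""
    if change_kind not in RED_REQUIRED_KINDS:
        return True  # the rule only constrains bug fixes and risky changes
    # A related build, lint, or broad test is not red-capable on its own.
    return waived_by_user or feedback_loop.get("red_capable") == "yes"
```

When this returns `False`, the text above asks for an architecture or verification finding rather than a quietly skipped check.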
+
+### Strict Full Workflow

 Strict mode always requires:

@@ -21,14 +105,15 @@ Strict mode always requires:
 6. At least one bounded work unit, even for a tiny change. Use `WU-001` when there is only one unit.
 7. A dossier for each implementation work unit before implementation begins.
 8. An acceptance matrix or acceptance draft with evidence expectations before implementation begins.
+- For outcome-bearing work, the matrix must include expected user/system-visible outcomes, preferred and available verification capabilities, evidence strength, invalid PASS conditions, and row-level outcome verdicts.
 9. A worker-agent plan with implementer, verifier, repair-author, and documenter agents.
-10. A worker lifecycle record using `planned -> handed_off -> acknowledged -> reported -> verified -> closed`.
+10. A worker lifecycle record using `planned -> handed_off -> acknowledged -> reported -> verified -> resource_closed -> closed`.
 11. Verification labeled as `self-check`, `focused-check`, or `independent-verifier`.
 12. A final disposition question or recorded completed-intake final disposition after verification.

-Worker agents are mandatory when the environment provides worker, subagent, thread, or portable delegation tools. The supervisor must hand off implementation, verification, repair-authoring when needed, and documentation to separate agents with scoped dossiers and the required report schema. Run worker agents sequentially by default unless completed intake explicitly authorizes parallelism.
+Worker agents are mandatory when strict mode is selected and the environment provides worker, subagent, thread, or portable delegation tools with a complete lifecycle: start, terminal report collection, and resource close. The supervisor must hand off implementation, verification, repair-authoring when needed, and documentation to separate agents with scoped dossiers and the required report schema. Run worker agents sequentially by default unless completed intake explicitly authorizes parallelism.

-If the environment cannot create, message,
+If the environment cannot create, message, delegate to, and close worker agents, record `worker_agent_unavailable` or `worker_resource_close_unavailable` and stop for the human decision unless completed intake explicitly selected `same_session_phased`. Do not silently collapse worker agents into same-session work.

 Do not nest supervisors recursively. A worker agent that receives a supervisor-scoped dossier must perform its assigned role instead of spawning another supervisor layer unless the parent supervisor explicitly asks for a child supervisor.

@@ -62,8 +147,12 @@ Treat roadmap phases, source "Build" lists, and exit criteria as material when t

 Create exactly one implementation work unit only when all current-scope material requirements can be implemented and verified inside that one unit without hiding source requirements in residual risks, skipped checks, future work, or next recommended actions. For multi-phase, dependency-heavy, or roadmap-driven work, create one work unit per independently verifiable phase, integration, data slice, or risk boundary.

+For user-facing behavior or integration behavior, make work units tracer-bullet shaped by default. Horizontal units are valid only for prefactoring, migration safety, infrastructure, documentation, research, or a dependency that cannot yet be verified as behavior, and they must include a horizontal-slice justification.
+
 Before final closeout, audit the coverage ledger. The workflow may be PASS only when every material requirement is mapped to a PASS acceptance row, explicitly waived by the user, or blocked and reported as not complete.

+For outcome evaluation, treat the implementer report as a claim, not truth. The required chain is source requirement -> acceptance row -> outcome evidence -> verifier verdict -> supervisor audit. Tests, typecheck, lint, and build are only evidence types; they are not enough for material behavior unless the row is explicitly technical or the command observes the expected outcome.
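
Reviewer sketch (not part of the package): the evidence chain described above can be checked row by row during the supervisor audit. All dictionary keys (`source_requirement`, `outcome_evidence`, `verifier_verdict`, `evidence_types`, `technical_row`, `observes_outcome`) are hypothetical names chosen for this illustration; the package's actual row schema may differ.

```python
# Evidence types the text calls out as insufficient on their own for material behavior.
GENERIC_EVIDENCE = {"tests", "typecheck", "lint", "build"}


def audit_row(row: dict) -> list[str]:
    """Return audit problems for one acceptance row; an empty list means the chain holds."""
    problems = []
    # An implementer report alone supplies no verifier verdict, so it fails this check.
    for link in ("source_requirement", "outcome_evidence", "verifier_verdict"):
        if not row.get(link):
            problems.append(f"missing {link}")
    evidence_types = set(row.get("evidence_types", []))
    only_generic = bool(evidence_types) and evidence_types <= GENERIC_EVIDENCE
    if only_generic and not (row.get("technical_row") or row.get("observes_outcome")):
        problems.append("generic checks do not observe the expected outcome")
    return problems
```

A row that returns any problem would stay out of PASS until the gap is closed or the user explicitly waives it.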
+
 ## SPEC Review And Q&A Gate

 Before final work units, create a concise reviewable spec as `.workflow/SPEC.md` when workflow docs are enabled, or as an inline SPEC review packet when state is inline. The SPEC is the human-readable contract for interpretation, not the execution plan.
@@ -141,12 +230,18 @@ When the human answers:
 - Run the complete intake gate before goal creation, worker delegation, implementation, publication, or other irreversible action.
 - Do not infer execution path, mode, delegation, final disposition, or boundaries from keywords, action verbs, or intent guesses.
 - Classify the workflow as `autonomous_goal` or `human_in_loop` only from completed intake answers before delegating workers or beginning implementation.
-- Explicit invocation always requires complete intake, work units, dossiers, worker-agent contracts, scoped handoffs, report schema, and verification
+- Explicit invocation always requires profile selection before heavy planning. `strict_full_workflow` requires complete intake, work units, dossiers, worker-agent contracts, scoped handoffs, report schema, and verification. `lean_work_unit_runner` requires an upfront scope contract, bounded work units, a compact ledger, focused checks, and escalation gates instead of full per-unit ceremony.
 - Preserve source-scope fidelity: do not translate controlling-source requirements into weaker proxy checks unless the user explicitly approves the narrower scope or waiver.
 - Always produce a plan after complete intake. In `human_in_loop`, make it an approval packet and stop for approval. In `autonomous_goal`, make it an execution plan and continue only when the completed intake authorizes that path.
-- Do not begin implementation until complete intake and the path gate are satisfied, at least one work unit exists, at least one concrete dossier exists, worker-agent contracts exist, and no stop gate applies.
+- Do not begin strict implementation until complete intake and the path gate are satisfied, at least one work unit exists, at least one concrete dossier exists, worker-agent contracts exist, and no stop gate applies.
+- Do not begin lean implementation until the scope contract is recorded, the backlog contains at least one ready unit, the compact ledger exists or can be kept inline, the current unit has source reference, scope, done signal, and check, and no escalation gate applies.
+- Do not begin product or integration implementation from a vague horizontal phase. Prefer a tracer-bullet unit with observable behavior and demo or verification; allow horizontal units only for prefactoring, migration, infrastructure, documentation, research, or risk-boundary work with a justification.
+- Do not mark a bug fix or risky behavior change PASS unless acceptance rows name a red-capable feedback loop, or the user explicitly accepts substitute evidence.
+- Do not mark outcome-bearing work PASS unless every material outcome row has fully observed evidence, or the user explicitly waives/narrows the missing proof. Row-level `CONDITIONAL_PASS` means strongly inferred but not fully observable; it is not a green final workflow status.
+- When a verification capability is unavailable, record the capability limitation and required external check instead of pretending the outcome was observed. Browser snapshots, visual diffs, live services, credentials, and human reviews are verifier adapters, not universal requirements.
 - Delegate workers only through an automated supported delegation transport after complete intake and the path gate authorize delegation. If no supported transport exists, use same-session phased mode only when intake allowed it; otherwise stop as `worker_agent_unavailable`.
 - Do not start implementer, verifier, repair-author, or documenter workers before complete intake and the path gate are satisfied; role-specific start conditions are additional gates after that.
+- Do not use native thread or native subagent workers unless the environment exposes a close operation for that transport. For Codex subagents, the supervisor must call `close_agent` for every `spawn_agent` id after the worker reaches a terminal report, times out, blocks, fails validation, is cancelled, or is no longer needed.
 - Keep roles separate: implementers implement, verifiers verify, repair authors write tickets, documenters update workflow artifacts, and the supervisor coordinates.
 - Treat same-session verification as a self-check, not independent verification. Separate verifier-agent verification may be labeled `independent-verifier` only when genuinely performed by a separate worker agent or thread.
 - Prefer explicit PASS/FAIL/BLOCKED states over soft completion language.
@@ -165,11 +260,26 @@ Treat these as distinct mechanisms:
 - Skill: reusable instructions loaded into the current agent.
 - Worker: a role-scoped automated execution run that receives one dossier and returns one terminal report.
 - Portable worker delegation: the package helper command, `workflow-supervisor delegate --agent <agent> --role <role> --unit <unit-id> --cwd <workspace> --dossier <path>`, which invokes an installed platform CLI and normalizes its report.
-- Native thread or subagent: an environment-specific transport a worker adapter may use when it
+- Native thread or subagent: an environment-specific transport a worker adapter may use only when it can also close the native worker resource after use.
 - Same-session phased mode: the current agent performs roles sequentially. Verification in this mode is a self-check, not independent verification.

 Start workers only after complete intake and the path gate are satisfied, at least one work unit exists, a concrete dossier exists, the loop policy authorizes delegation, and the environment exposes an automated supported transport. If environment rules require explicit user approval for user-visible native thread creation, obtain it before using that transport. Do not use manual copy/paste handoff as the primary path. If automated delegation is unavailable, mark the unit `worker_agent_unavailable` unless completed intake explicitly selected same-session phased work.

+### Native Worker Resource Lifecycle
+
+Logical worker completion is not enough for native thread or subagent transports. A worker is not `closed` until its native resource has also been released.
+
+For every native worker:
+
+1. Record the native resource id immediately after creation, such as the Codex `agent_id` returned by `spawn_agent`, in the worker map.
+2. Record transport, worker name, role, work unit, dossier, start time, and close requirement before waiting on the worker.
+3. Collect one terminal report or mark the worker `BLOCKED` because of timeout, invalid output, unavailable adapter, cancellation, or missing evidence.
+4. Call the native close operation as soon as the terminal report or blocker is captured. For Codex subagents, call `close_agent` with the recorded `agent_id`.
+5. Record the close result and previous native status. Only then move the worker to `resource_closed` and then `closed`.
+6. Before final workflow outcome, audit the worker map. If any native worker has no close result, final status is `BLOCKED` with reason `open_native_worker`.
+
+Native worker ids are resource handles, not evidence. Do not use a completed worker report, a subagent notification, or a wait result as a substitute for closing the native worker. If the close operation fails or is unavailable, stop and report `worker_resource_close_failed` or `worker_resource_close_unavailable`; do not keep spawning replacement workers.
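
Reviewer sketch (not part of the package): step 6 of the lifecycle above, the final worker-map audit, could look like the following. The row keys `native_resource_id` and `close_result` are assumed to mirror the new Worker Map columns "Native Resource ID" and "Close Result"; the function name and return shape are hypothetical.

```python
from typing import Optional


def audit_worker_map(workers: list[dict]) -> tuple[bool, Optional[str], list[str]]:
    """Return (blocked, reason, open_ids) for the final close audit.

    A native worker counts as open when it has a resource id but no recorded
    close result, regardless of whether its terminal report was received.
    """
    open_ids = [
        w["native_resource_id"]
        for w in workers
        if w.get("native_resource_id") and not w.get("close_result")
    ]
    if open_ids:
        return True, "open_native_worker", open_ids
    return False, None, []
```

Workers without a native resource id (for example same-session phased roles) are skipped, since the close requirement applies only to native transports.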
+
 ## Worker Report Schema

 Every worker report back to the supervisor must use this schema:
@@ -188,6 +298,8 @@ skipped_checks:
 blockers:
 residual_risks:
 next_recommended_action:
+verification_environment:
+outcome_evaluations:
 ```

 Implementers may edit only allowed surfaces from the dossier. Verifiers must not edit. Repair authors write repair tickets from failed acceptance rows and must not expand scope. Documenters update only approved workflow or documentation surfaces after source, implementation, verification, or repair evidence exists.
@@ -201,6 +313,7 @@ Do not use keywords to skip intake. Words such as "autonomous", "agent loop", "w
 Required intake decisions:

 - Objective and source: what artifact, spec, repo path, document, ticket, or source set controls the work.
+- Profile: `lean_work_unit_runner`, `strict_full_workflow`, or `planning_only`.
 - Execution path: `autonomous_goal` or `human_in_loop`.
 - Execution mode: `sequential`, `parallel_where_safe`, or `staged_parallel`.
 - Delegation: `automated_worker_delegation`, `native_threads_or_subagents_if_available`, or `same_session_phased`.
@@ -215,53 +328,72 @@ Use this question shape for the first intake ask:
|
|
|
215
328
|
```text
|
|
216
329
|
Before I start the supervisor loop, answer every intake item:
|
|
217
330
|
1. Objective and source: what artifact, spec, repo path, document, ticket, or source set controls the work?
|
|
218
|
-
2.
|
|
219
|
-
3.
|
|
220
|
-
4.
|
|
221
|
-
5.
|
|
222
|
-
6.
|
|
223
|
-
7.
|
|
331
|
+
2. Profile: lean_work_unit_runner, strict_full_workflow, or planning_only?
|
|
332
|
+
3. Execution path: autonomous_goal or human_in_loop?
+4. Mode: sequential, parallel where safe, or staged parallel?
+5. Delegation: same-session phased, automated worker delegation, or native threads/subagents if available?
+6. Final disposition: keep local, open PR, push main, deploy/publish, or ask at the end?
+7. Boundaries: may I install dependencies, call external services, use credentials, or only edit local files?
+8. State artifacts: compact ledger, `.workflow/` docs, another artifact directory, or inline state?
 ```
 
 If the user answers only some intake items, ask only the unanswered or ambiguous item(s) again and stop. If the user says "use your judgment", treat that item as unanswered; do not substitute defaults. Continue prompting until every required intake decision has an explicit user answer.
 
-Treat `autonomous_goal`, PR creation, direct push, deploy, publication, paid operations, production data changes, and credential use as satisfied only by completed intake answers, not by keywords elsewhere in the prompt.
+Treat `strict_full_workflow`, `autonomous_goal`, PR creation, direct push, deploy, publication, paid operations, production data changes, and credential use as satisfied only by completed intake answers or an explicit profile selection, not by vague keywords elsewhere in the prompt.
 
 Negative example: "Using Workflow Supervisor, generate an API and create the project" is not autonomous authorization and is not complete intake. It names the supervisor and objective, but leaves required intake decisions unresolved. Ask the complete intake packet and stop before implementation.
 
 ## Supervisor Loop
 
 1. Run the complete intake gate. Record explicit user answers. If any required intake answer is missing, vague, or delegated to judgment, ask for the unresolved item(s) and stop.
-2. Restate the objective, constraints, non-goals, known sources, and unknowns from the completed intake.
+2. Restate the selected profile, objective, constraints, non-goals, known sources, and unknowns from the completed intake.
 3. Bind or reconcile the Codex goal only after complete intake and only when no unrelated active goal prevents binding.
-4.
-
-
-
-
-
-
-
-
+4. If the profile is `lean_work_unit_runner`, run the lean loop:
+- Confirm the source contains bounded work units or create a short upfront backlog contract. If not possible, pause for a decision or switch to `planning_only` or `strict_full_workflow`.
+- Create or select one compact ledger instead of full workflow docs.
+- Verify each ready unit has `id`, `source_ref`, `slice_type`, `scope`, `done`, `check`, and `status`; product or integration units also need `observable_behavior`, `expected_outcome`, and `demo_or_verification`.
+- Present a concise batch plan in `human_in_loop`, or continue in `autonomous_goal` when intake permits it.
+- Execute one unit at a time with targeted inspection, smallest patch, focused check, ledger update, and checkpoint cadence.
+- Escalate only the affected unit or batch when a strict-mode trigger appears; do not convert the whole backlog to strict mode unless the source contract is invalid.
+- Finish with a compact outcome naming units completed, blocked, failed, escalated, checks, skipped checks, residual risks, and final disposition.
+5. If the profile is `planning_only`, stop after source grounding, backlog shape, risks, recommended profile, and approval questions. Do not implement.
+6. If the profile is `strict_full_workflow`, continue the strict loop below.
+7. Build or request a source corpus map. Use `$source-corpus` when source authority, freshness, or contradictions matter.
+8. Create the source-requirement coverage ledger. If any material source requirement cannot be classified, mapped to work, or explicitly deferred, stop and ask for the missing scope decision.
+9. Create the SPEC review packet or `.workflow/SPEC.md` from the source corpus and coverage ledger.
+10. Run the SPEC Q&A gate. In `human_in_loop`, stop until the human asks questions, receives answers or revisions, and explicitly approves the SPEC. In `autonomous_goal`, continue only when no blocking questions remain and approval is not required by intake.
+11. Split the objective into bounded work units from the approved or non-blocked SPEC and coverage ledger. Use `$work-unit` for ambiguous, multi-phase, product, or integration goals. Prefer tracer-bullet units for user-facing or integration behavior. If the task is tiny and the ledger has no deferred material requirements, create exactly one work unit named `WU-001`.
+12. Choose a loop policy before starting work: sequential or parallel, retry limits, approval gates, budgets, goal update cadence, and blocker rules. Use `$loop-policy` when the policy is not obvious.
+13. Build dossiers for the first implementation units and any planned verification, repair, or documentation workers. Use `$dossier-builder` when delegating work to another agent or when the task has boundaries.
+14. Assign worker roles with explicit allowed and forbidden behavior. Use `$worker-roles` for multi-agent, native-thread, or portable-worker work.
+15. Select the execution path:
 - `human_in_loop`: use when selected in completed intake or when a higher-priority rule requires human approval after intake.
 - `autonomous_goal`: use only when selected in completed intake and no higher-priority rule requires human approval.
-
-
+16. If `.workflow/` artifacts will be used in a Git-backed codebase, ensure `.gitignore` contains `.workflow/` before writing them.
+17. Present the path-specific plan:
 - `human_in_loop`: approval packet with plan, work units, worker delegation plan, approval gates, stop gates, and first dossiers. Stop until the human approves or revises it.
 - `autonomous_goal`: execution plan with the same contents plus autonomous boundaries, allowed actions, stop gates, repair limits, and final disposition policy. Continue after recording it only when complete intake authorized that path.
-
-
-
-
-
-
-
-
-
-
+18. After the path gate is satisfied, delegate named workers from the worker delegation plan through the selected automated transport. Send each worker only its role, dossier, sources, acceptance rows, stop gates, and report schema. For native threads or subagents, record the native resource id immediately and confirm a close operation exists before starting more workers.
+19. Collect one terminal report from each worker. If a worker asks a human-facing question, convert it to `BLOCKED` and have the supervisor ask the user only when the path policy permits. For native threads or subagents, close the native resource after the report or blocker is captured.
+20. Verify independently where possible. Use `$acceptance-matrix` to map every requirement to evidence. Start verifier workers only after the relevant implementer report is available.
+- For outcome-bearing rows, record the verification environment and capability manifest before judging evidence.
+- Compare what the work unit required, what the implementer changed, touched surfaces, forbidden surfaces, and whether the result is a no-op, placeholder, hardcoded fixture, test-only fake, or scope creep.
+- Evaluate behavior, not just build health: run the feature or use case end to end when possible, call the API/CLI/UI route directly, inspect generated files/output/state, verify data/schema/contracts, check before/after behavior, and test negative or adversarial cases.
+- If browser or visual proof is unavailable, use the strongest available lower-level observable contract such as jsdom render, server-rendered output, state-machine/view-model test, API probe, file snapshot, route manifest, or static semantic diff inspection. Mark any missing stronger proof as `CONDITIONAL_PASS` or BLOCKED at the row level.
+- For bug fixes and risky behavior changes, require a feedback loop with `command_or_evidence`, `red_capable`, `exact_symptom_or_behavior`, `deterministic`, `expected_runtime`, and `agent_runnable`.
+- Classify evidence as `behavior_was_tested`, `related_check_ran`, or `substitute_evidence_accepted`.
+- Treat PASS without a behavior-catching loop as BLOCKED unless waiver evidence accepts substitute evidence.
+21. If verification FAILs, convert findings into repair tickets and route them to a repair-author or implementer repair worker. Do not expand scope during repair.
+22. Re-run verification after repairs. Continue only until PASS, BLOCKED, repair limit, or path stop.
+23. Start documenter workers only after source, implementation, verification, or repair evidence exists, unless the documenter is explicitly creating planning state.
+24. If verification BLOCKs, record the resume checkpoint, report the blocker, and stop or ask for the missing decision. When the human answers, use Resume After Human Decision.
+25. Use `$workflow-docs` to create or refresh reusable Markdown artifacts under `<workspace>/.workflow/` when the workflow must persist across context loss, agents, or sessions.
+26. Audit skipped checks, residual risks, future work, and next recommended actions against the source-requirement coverage ledger. If any material source requirement appears there without an explicit user deferral or waiver, mark the workflow FAIL/BLOCKED and create more work units or ask for a scope decision.
+- Audit outcome rows separately. A row-level `CONDITIONAL_PASS` cannot be hidden inside a final PASS unless the final report names the limitation and an explicit user waiver accepts that limitation.
+27. When all material acceptance rows are PASS or waived, apply the final disposition policy:
 - `human_in_loop`: use the completed intake final disposition; if it is `ask_at_end`, ask the human to choose PR, push main, or keep local.
 - `autonomous_goal`: use the completed intake final disposition. If it is `ask_at_end`, stop and ask before taking any final disposition action.
-
+28. Finish with an outcome report that names profile, execution path, goal status, sources, SPEC decision when strict, coverage or ledger disposition, work units, delegated workers or same-session execution, native worker close status, checks, skipped checks, residual risks, final disposition decision, and next action.
 
 ## Execution Paths
 
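Reviewer note on the hunk above: the unit readiness rule in step 4 of the Supervisor Loop could be checked mechanically. The following is a minimal sketch under stated assumptions, not code from the package. The field names are quoted from the SKILL.md text; the function name `missing_fields`, the dictionary shape of a unit, and the `slice_type` labels `product` and `integration` are hypothetical.

```python
# Hypothetical sketch: the field names are quoted from the SKILL.md text;
# the dict layout and the slice_type labels are assumptions.
BASE_FIELDS = ("id", "source_ref", "slice_type", "scope", "done", "check", "status")
OUTCOME_FIELDS = ("observable_behavior", "expected_outcome", "demo_or_verification")
OUTCOME_SLICE_TYPES = {"product", "integration"}  # assumed labels


def missing_fields(unit):
    """Return readiness fields that are absent or empty for one work unit."""
    required = list(BASE_FIELDS)
    if unit.get("slice_type") in OUTCOME_SLICE_TYPES:
        required.extend(OUTCOME_FIELDS)
    return [name for name in required if not unit.get(name)]


unit = {
    "id": "WU-001",
    "source_ref": "BACKLOG.md#item-1",  # hypothetical reference
    "slice_type": "product",
    "scope": "login form validation",
    "done": "invalid email shows an inline error",
    "check": "focused unit test for the form",
    "status": "ready",
}

# A product unit without the outcome fields is not ready for the lean loop.
print(missing_fields(unit))
```

A non-empty result would correspond to the stop condition for a lean backlog that lacks clear unit fields; how the supervisor reacts (pause, escalate, or narrow the unit) stays a policy decision outside this sketch.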
@@ -269,7 +401,9 @@ Negative example: "Using Workflow Supervisor, generate an API and create the pro
 
 Use `human_in_loop` when the completed intake selects it, or when a higher-priority rule requires human approval after intake. If the user has not answered the execution-path intake item, stop and ask for that answer instead of inferring a path.
 
-
+In `lean_work_unit_runner`, the first review deliverable is the scope contract plus compact ledger and batch checkpoint policy. Stop for approval before the first batch unless the user explicitly selected autonomous execution.
+
+In `strict_full_workflow`, the first review deliverable after source coverage is the SPEC review packet, not implementation. After the SPEC Q&A gate is approved, the supervisor presents the implementation approval packet. The approval packet must include:
 
 - objective and non-goals
 - source corpus summary and gaps
@@ -298,6 +432,8 @@ Use `autonomous_goal` only when the completed intake selects it. Phrases such as
 - stop gates, repair limits, budgets, and escalation rules
 - final disposition policy: `open_pr_when_green`, `push_main_when_green`, or `keep_local_when_green`
 
+For `lean_work_unit_runner`, replace SPEC, coverage-ledger, and worker-delegation details with the scope contract, compact ledger path, unit readiness fields, batch checkpoint cadence, resource gates, focused-check policy, and strict-mode escalation triggers.
+
 The final disposition must come from the completed intake. Direct push to the main branch, PR creation, deploy, publication, paid operations, production data changes, credential use, and destructive operations require explicit answers in the relevant intake fields.
 
 Even in `autonomous_goal`, stop and ask when any required intake answer is missing or ambiguous, required sources are missing, acceptance cannot be verified, a worker needs scope expansion, an irreversible action lacks intake authorization, or higher-priority instructions require approval.
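Reviewer note on the hunk above: the rule that gated actions need explicit answers in completed intake, never inferred ones, might look like this in code. It is a sketch only. The action labels, the `intake` dictionary shape, and the function name `is_authorized` are assumptions for illustration and do not come from the package.

```python
# Hypothetical sketch: action labels and the intake dict shape are assumptions.
GATED_ACTIONS = {
    "open_pr",
    "push_main",
    "deploy",
    "publish",
    "paid_operation",
    "production_data_change",
    "credential_use",
    "destructive_operation",
}


def is_authorized(action, intake):
    """Allow a gated action only when completed intake answered it explicitly."""
    if action not in GATED_ACTIONS:
        # Ungated actions fall under other intake boundaries, out of scope here.
        return True
    # Only a literal True counts. A missing key, None, or a delegated answer
    # such as "use your judgment" is treated as unanswered.
    return intake.get(action) is True


intake = {"open_pr": True, "push_main": "use your judgment"}

print(is_authorized("open_pr", intake))
print(is_authorized("push_main", intake))
print(is_authorized("deploy", intake))
```

The function never looks at prompt wording, which mirrors the text: keywords elsewhere in the prompt cannot authorize an action that intake left open.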
@@ -306,7 +442,7 @@ When `autonomous_goal` stops for a human decision, it should usually leave the C
 
 ## Portable Worker Delegation
 
-After the path gate is satisfied, use the selected automated worker transport. The portable default is the package helper:
+After the path gate is satisfied, use the selected automated worker transport. Prefer the portable delegate path when it satisfies the work because it is one-shot and does not leave a native thread or subagent resource open. The portable default is the package helper:
 
 ```text
 workflow-supervisor delegate --agent <agent> --role <role> --unit <unit-id> --cwd <workspace> --dossier <path>
@@ -320,7 +456,7 @@ workflow-supervisor validate-dossier <path> --role <role> --unit <unit-id> --jso
 
 If the dossier does not pass `DossierV1` validation, do not start the worker. Create a discovery dossier, ask for the missing decision, or mark the unit BLOCKED.
 
-Adapters may use native threads, native subagents, or one-shot CLI execution underneath, but the supervisor consumes only the normalized worker report. Use `workflow-supervisor delegate-doctor --agent <agent> --probe` to test the installed local adapter before relying on it for a workflow. If automated delegation is unavailable, mark execution as `worker_agent_unavailable` unless completed intake selected `same_session_phased`.
+Adapters may use native threads, native subagents, or one-shot CLI execution underneath, but the supervisor consumes only the normalized worker report plus the transport lifecycle result. Use `workflow-supervisor delegate-doctor --agent <agent> --probe` to test the installed local adapter before relying on it for a workflow. If automated delegation is unavailable, mark execution as `worker_agent_unavailable` unless completed intake selected `same_session_phased`. If native delegation is available but native close is unavailable, mark execution as `worker_resource_close_unavailable` and choose portable delegation or same-session phased only when intake permits it.
 
 Name workers deterministically from the workflow, unit, role, and dossier:
 
@@ -342,7 +478,7 @@ Use one worker per role per work unit unless the loop policy explicitly allows b
 - kickoff: role, dossier, sources, acceptance rows, stop gates, report schema
 - checkpoint: request status, blockers, or clarification without expanding scope
 - repair delegation: failed rows, verifier findings, allowed repair surfaces, checks
-- closeout: collect terminal report and confirm no further action is expected
+- closeout: collect terminal report, close any native worker resource, and confirm no further action is expected
 
 Workers must not ask the human questions directly, choose final disposition, approve plans, expand scope, or message each other. They return `PASS`, `FAIL`, or `BLOCKED` using the assigned report schema. The supervisor routes blockers, repairs, and human questions.
 
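Reviewer note on the hunk above: the closeout rules (a terminal `PASS`, `FAIL`, or `BLOCKED` report, human-facing questions converted to `BLOCKED`, and native resources closed before the worker counts as finished) can be sketched as one normalization step. This is an illustration under assumptions: the `WorkerResult` fields and the `closeout` function are hypothetical and are not the package's `worker-report-v1` schema. Mapping a missing close result to `BLOCKED` is also an assumption; the text only says the workflow must stop in that case.

```python
from dataclasses import dataclass
from typing import Optional

TERMINAL_STATUSES = {"PASS", "FAIL", "BLOCKED"}


@dataclass
class WorkerResult:
    # Hypothetical shape, not the worker-report-v1 schema.
    status: str
    human_question: Optional[str] = None
    native_resource_id: Optional[str] = None
    close_result: Optional[str] = None


def closeout(result: WorkerResult) -> str:
    """Normalize one worker result into the status the supervisor acts on."""
    if result.status not in TERMINAL_STATUSES:
        raise ValueError(f"not a terminal status: {result.status!r}")
    # A native thread or subagent without a recorded close result stops the workflow.
    if result.native_resource_id and not result.close_result:
        return "BLOCKED"
    # Workers never ask the human directly; the supervisor routes the question.
    if result.human_question:
        return "BLOCKED"
    return result.status


print(closeout(WorkerResult(status="PASS")))
print(closeout(WorkerResult(status="PASS", native_resource_id="thread-1")))
print(closeout(WorkerResult(status="PASS", human_question="Which database?")))
```

Portable one-shot delegation leaves `native_resource_id` empty, so the close check never fires, which is one reason the text prefers that transport when it satisfies the work.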
@@ -379,22 +515,31 @@ If any item is unknown and material, stop and ask for the missing decision or ma
 
 Stop when:
 
+- the profile is missing or unclear and cannot be selected from explicit user intent plus controlling source
 - any required intake answer is missing, vague, delegated to judgment, or contradicted by another intake answer
 - source authority cannot be established
 - sources contradict each other on a material requirement
 - the requested scope cannot fit into a bounded work unit
+- `lean_work_unit_runner` is selected but the backlog lacks clear unit ids, source references, boundaries, done signals, or targeted checks
+- a product or integration unit is a vague horizontal phase without observable behavior, demo or verification, valid non-product slice type, or horizontal-slice justification
+- `lean_work_unit_runner` finds a strict-mode risk trigger and the user has not authorized escalation, deferral, or a narrower unit
 - the coverage ledger is missing, incomplete, or contains material requirements classified as future work without explicit user deferral
 - human-in-loop SPEC approval is missing, marked Needs Revision, marked Blocked, or has unanswered Q&A
 - a human decision was answered but affected downstream coverage, SPEC, work units, acceptance, dossiers, or verification have not been refreshed
 - mandatory approval packet, work unit, dossier, worker-agent contract, or acceptance matrix is missing
 - allowed and forbidden surfaces cannot be named
 - acceptance cannot be verified with evidence
+- material outcome evidence is only "tests passed", typecheck, build, or implementation prose without expected-outcome observation
+- a material outcome row is `CONDITIONAL_PASS` but the final workflow is being marked PASS without explicit waiver evidence
+- a required browser, visual, live-service, credential, network, or human-review capability is unavailable and no waiver or blocked status is recorded
+- a bug fix or risky behavior change has only related checks and no red-capable feedback loop or explicit substitute-evidence waiver
 - a verifier is asked to edit or an implementer is asked to self-approve
 - repair loops repeat without new evidence
 - the user requires approval before continuing
 - the selected path is `autonomous_goal` but it was inferred from prompt wording instead of a completed intake answer
 - an irreversible action is requested without explicit authorization in the completed intake
 - a worker asks to expand scope without supervisor or human approval
+- a native thread or subagent worker has no recorded close result
 - final verification is not green and no waiver evidence exists
 - residual risks, skipped checks, future work, or next actions contain unimplemented material source requirements
 
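Reviewer note on the hunk above: the new stop conditions on outcome evidence amount to a roll-up rule from row status to final workflow status. A minimal sketch follows, assuming a hypothetical row shape with `status`, `evidence`, and `waived` keys. The status and evidence labels are quoted from the SKILL.md text. Two simplifications are assumptions: the behavior-evidence check is applied to every row (the text scopes it to outcome-bearing rows, bug fixes, and risky changes), and FAIL takes precedence over BLOCKED.

```python
# Hypothetical sketch: the row dict shape and precedence order are assumptions.
def effective_status(row):
    """Downgrade a row whose green status is not backed by evidence or a waiver."""
    status = row["status"]
    waived = row.get("waived", False)
    if status == "CONDITIONAL_PASS":
        # Cannot be hidden inside a final PASS without an explicit waiver.
        return "PASS" if waived else "BLOCKED"
    if status == "PASS" and row.get("evidence") != "behavior_was_tested" and not waived:
        # A related check alone does not prove the behavior.
        return "BLOCKED"
    return status


def final_status(rows):
    """Roll row-level results up into one workflow status."""
    statuses = [effective_status(row) for row in rows]
    if "FAIL" in statuses:
        return "FAIL"
    if "BLOCKED" in statuses:
        return "BLOCKED"
    return "PASS"


rows = [
    {"status": "PASS", "evidence": "behavior_was_tested"},
    {"status": "CONDITIONAL_PASS", "evidence": "substitute_evidence_accepted"},
]

print(final_status(rows))  # the unwaived CONDITIONAL_PASS row blocks the workflow
rows[1]["waived"] = True
print(final_status(rows))  # an explicit waiver lets the roll-up go green
```

A real implementation would also carry the waiver text into the final report, since the source requires the report to name the limitation that the waiver accepts.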
@@ -403,19 +548,23 @@ Stop when:
 Report:
 
 - Status: PASS, FAIL, BLOCKED, or PARTIAL
+- Profile: lean_work_unit_runner, strict_full_workflow, or planning_only
 - Execution path: autonomous_goal or human_in_loop
 - Goal status and whether a Codex goal was created, reused, skipped, completed, or blocked
 - Objective handled
 - Sources used and gaps
 - SPEC status, Q&A summary, and human decision or autonomous approval policy
 - Source-requirement coverage ledger summary, including deferred or blocked material requirements
-- Work units completed or remaining
+- Work units completed, blocked, failed, escalated, or remaining
+- Compact ledger path or inline ledger summary when in lean mode
 - Approval question id and whether `WAITING_FOR_HUMAN -> ACTIVE` occurred
 - Human decision resume status, affected artifacts, and whether stale downstream artifacts were invalidated
 - Dossiers created or missing
 - Workers delegated, blocked, unavailable, or skipped
-- Worker lifecycle status for each role
+- Worker lifecycle status for each role, including native resource ids and close results when native threads or subagents were used
 - Verification evidence
+- Verification environment and capability limitations when they affected proof strength
+- Outcome evaluation rows, including any `CONDITIONAL_PASS` rows and required external checks
 - Repairs performed or recommended
 - Checks run and skipped
 - Residual risks
@@ -1,7 +1,7 @@
 interface:
   display_name: "Workflow Supervisor"
-  short_description: "Run
-  default_prompt: "Use $workflow-supervisor
+  short_description: "Run lean, strict, or planning-only supervised workflows"
+  default_prompt: "Route first: if Workflow Supervisor was not explicitly invoked and the task is a small clear edit with obvious files and acceptance, do not invoke it; execute directly. Use $workflow-supervisor only after this route check. If $workflow-supervisor was explicitly invoked, select the execution profile first: lean_work_unit_runner for large bounded backlogs and low-footprint direct execution, strict_full_workflow for ambiguous/high-risk/delegated work, or planning_only for sequencing without implementation. Ask required intake questions and stop until the user explicitly answers all missing items. Do not infer path, mode, delegation, final disposition, or boundaries from vague keywords. In lean mode, keep work units and a compact ledger, avoid subagents unless explicitly authorized, run targeted checks, and escalate unclear or risky units. In strict mode, create a source-requirement coverage ledger and SPEC review gate before work units, preserve source requirements in acceptance rows, and do not hide unimplemented material requirements in residual risks or future work. For outcome-bearing work, treat implementer output as a claim: require expected outcomes, row-mapped outcome evidence, capability limitations, and evidence strength; row-level CONDITIONAL_PASS is not final green status. Prefer one-shot portable delegation when it satisfies the work. If native threads or subagents are used, record each native resource id, call the native close action such as close_agent after terminal report or blocker capture, and block final outcome if any native worker lacks a close result."
 
 policy:
   allow_implicit_invocation: false
|