workflow-supervisor 0.1.2 → 0.1.4
- package/CHANGELOG.md +109 -0
- package/README.md +91 -40
- package/bin/workflow-skills.mjs +3 -2
- package/docs/artifacts.md +5 -0
- package/docs/portable-delegation.md +5 -0
- package/docs/skill-reference.md +5 -5
- package/docs/troubleshooting.md +26 -0
- package/package.json +8 -2
- package/skills/acceptance-matrix/SKILL.md +29 -2
- package/skills/loop-policy/SKILL.md +34 -8
- package/skills/work-unit/SKILL.md +19 -0
- package/skills/workflow-docs/SKILL.md +4 -2
- package/skills/workflow-docs/references/goal-resume.md +48 -3
- package/skills/workflow-docs/references/templates.md +2 -0
- package/skills/workflow-docs/references/workflow-control.md +186 -2
- package/skills/workflow-supervisor/SKILL.md +247 -49
- package/skills/workflow-supervisor/agents/openai.yaml +2 -2
package/CHANGELOG.md
ADDED
|
@@ -0,0 +1,109 @@
|
|
|
1
|
+
# Changelog
|
|
2
|
+
|
|
3
|
+
This changelog was reconstructed from npm publish metadata and git history after the first four package versions were published without GitHub releases or tags.
|
|
4
|
+
|
|
5
|
+
## 0.1.4 - 2026-06-19
|
|
6
|
+
|
|
7
|
+
Prepared for npm publication.
|
|
8
|
+
|
|
9
|
+
### Added
|
|
10
|
+
|
|
11
|
+
- Added profile-based supervisor execution with `lean_work_unit_runner`, `strict_full_workflow`, and `planning_only`.
|
|
12
|
+
- Added compact lean-runner ledger guidance for large bounded backlogs that need lower memory and less ceremony.
|
|
13
|
+
- Added native worker resource lifecycle rules for thread and subagent transports.
|
|
14
|
+
|
|
15
|
+
### Changed
|
|
16
|
+
|
|
17
|
+
- Changed strict worker lifecycle from logical closeout only to `planned -> handed_off -> acknowledged -> reported -> verified -> resource_closed -> closed`.
|
|
18
|
+
- Required native worker transports to record resource ids, close actions, and close results before final workflow outcome.
|
|
19
|
+
- Made one-shot portable delegation the preferred worker path when it satisfies the work, because it avoids resident native workers.
|
|
20
|
+
|
|
21
|
+
### Fixed
|
|
22
|
+
|
|
23
|
+
- Prevented completed Codex subagents from remaining open after workflow-supervisor runs by requiring `close_agent` for every recorded native `agent_id`.
|
|
24
|
+
- Blocked final PASS when any native worker has no recorded close result.
|
|
25
|
+
- Reduced large-backlog memory pressure by defaulting lean execution to same-session phased work unless workers are explicitly authorized or risk escalation requires them.
|
|
26
|
+
|
|
27
|
+
### Verified
|
|
28
|
+
|
|
29
|
+
- Expanded lifecycle tests to cover profile selection, lean ledgers, native worker resource ids, `close_agent`, and close-result gates.
|
|
30
|
+
|
|
31
|
+
## 0.1.3 - 2026-06-17
|
|
32
|
+
|
|
33
|
+
Published to npm: 2026-06-17 22:09:08 UTC
|
|
34
|
+
|
|
35
|
+
Commit: `154bbd7`
|
|
36
|
+
|
|
37
|
+
### Added
|
|
38
|
+
|
|
39
|
+
- Added resumable SPEC gate behavior so broad source-controlled workflows can pause for human review before final work units, dossiers, and implementation.
|
|
40
|
+
- Added resume guidance for autonomous workflows that block on a human decision, including updates to workflow state, goal state, and decision artifacts.
|
|
41
|
+
- Expanded troubleshooting guidance for broad roadmap scope, residual risks that hide required work, and SPEC review before work units.
|
|
42
|
+
|
|
43
|
+
### Changed
|
|
44
|
+
|
|
45
|
+
- Hardened workflow-supervisor scope coverage so material source requirements, roadmap phases, exit criteria, named systems, and numeric targets must be mapped to work units, explicitly deferred, blocked, or marked non-material.
|
|
46
|
+
- Updated acceptance, loop-policy, work-unit, and workflow-docs instructions to preserve source requirement strength and avoid quiet downgrades.
|
|
47
|
+
|
|
48
|
+
### Verified
|
|
49
|
+
|
|
50
|
+
- Expanded workflow-supervisor lifecycle tests for source coverage, SPEC review, and resume behavior.
|
|
51
|
+
|
|
52
|
+
## 0.1.2 - 2026-06-17
|
|
53
|
+
|
|
54
|
+
Published to npm: 2026-06-17 16:00:10 UTC
|
|
55
|
+
|
|
56
|
+
Commit: `b449656`
|
|
57
|
+
|
|
58
|
+
### Changed
|
|
59
|
+
|
|
60
|
+
- Reworked the workflow-supervisor skill around a stricter worker-agent supervisor architecture.
|
|
61
|
+
- Made explicit supervisor invocation require full intake, work units, dossiers, worker-agent contracts, scoped handoffs, report schema, and verification even for small tasks.
|
|
62
|
+
- Clarified that implementation, verification, repair-authoring, and documentation are separate worker-agent responsibilities when an automated worker path is available.
|
|
63
|
+
- Rewrote the README around the strict worker supervisor model and the current package workflow.
|
|
64
|
+
|
|
65
|
+
### Verified
|
|
66
|
+
|
|
67
|
+
- Added lifecycle coverage for strict supervisor invocation behavior.
|
|
68
|
+
|
|
69
|
+
## 0.1.1 - 2026-06-15
|
|
70
|
+
|
|
71
|
+
Published to npm: 2026-06-15 10:59:19 UTC
|
|
72
|
+
|
|
73
|
+
Commit: `ee4c02b`
|
|
74
|
+
|
|
75
|
+
### Added
|
|
76
|
+
|
|
77
|
+
- Added portable worker delegation for Codex and Claude Code through `workflow-supervisor delegate`.
|
|
78
|
+
- Added `WorkerReportV1` and `DossierV1` schema artifacts plus dossier validation before delegation.
|
|
79
|
+
- Added `delegate-doctor` for adapter inspection and optional probe runs.
|
|
80
|
+
- Added project-scope `.workflow/` ignore handling for local workflow state.
|
|
81
|
+
- Added portable delegation documentation and tests for install, delegation, and lifecycle behavior.
|
|
82
|
+
|
|
83
|
+
### Changed
|
|
84
|
+
|
|
85
|
+
- Renamed the primary package executable path around `workflow-supervisor` while keeping `workflow-skills` as an executable alias.
|
|
86
|
+
- Narrowed certified install/delegation targets to Codex, Claude Code, and generic Markdown contexts.
|
|
87
|
+
- Strengthened validation to include adapter metadata and schema artifacts.
|
|
88
|
+
|
|
89
|
+
### Verified
|
|
90
|
+
|
|
91
|
+
- Added Node test coverage for delegate CLI behavior, installation behavior, portable delegation, and supervisor lifecycle handling.
|
|
92
|
+
|
|
93
|
+
## 0.1.0 - 2026-06-14
|
|
94
|
+
|
|
95
|
+
Published to npm: 2026-06-14 23:35:57 UTC
|
|
96
|
+
|
|
97
|
+
Source: npm tarball contents. The GitHub release tag for this version is a reconstructed source snapshot from the npm tarball because no exact matching commit exists in the branch history for this first publish.
|
|
98
|
+
|
|
99
|
+
### Added
|
|
100
|
+
|
|
101
|
+
- Initial npm package for the workflow-supervisor skill pack.
|
|
102
|
+
- Added the bundled skills: `workflow-supervisor`, `worker-roles`, `acceptance-matrix`, `dossier-builder`, `source-corpus`, `loop-policy`, `work-unit`, and `workflow-docs`.
|
|
103
|
+
- Added the `workflow-supervisor` and `workflow-skills` executables for listing, validating, installing, uninstalling, and emitting portable context.
|
|
104
|
+
- Added Codex, Claude Code, OpenCode, HermesAgent, and generic adapter metadata, plus package documentation, troubleshooting notes, compatibility notes, and a README overview.
|
|
105
|
+
- Added packaging metadata, test coverage, and prepublish validation through `npm run validate`.
|
|
106
|
+
|
|
107
|
+
### Verified
|
|
108
|
+
|
|
109
|
+
- Initial package validation covered skill folder structure, `SKILL.md` metadata, and publishable package layout.
|
package/README.md
CHANGED
|
@@ -1,8 +1,8 @@
|
|
|
1
1
|
# Workflow Supervisor
|
|
2
2
|
|
|
3
|
-
Workflow Supervisor is a
|
|
3
|
+
Workflow Supervisor is a profile-based supervision skill pack for agent work that needs to stay organized, resumable, evidence-backed, and proportional to the work.
|
|
4
4
|
|
|
5
|
-
It is for moments when you do not want an agent to
|
|
5
|
+
It is for moments when you do not want an agent to lose the thread halfway through, quietly skip scope, or turn a large backlog into an unreviewable blur. You ask for the supervisor; it selects the right execution profile, keeps the work units explicit, verifies results with evidence, and leaves a clear outcome trail. Heavy multi-agent ceremony is available when risk justifies it; large pure backlogs can use a lean runner that keeps the agent focused on delivery.
|
|
6
6
|
|
|
7
7
|
Example prompt:
|
|
8
8
|
|
|
@@ -19,8 +19,12 @@ The correct first response is not code. The correct first response is an intake
|
|
|
19
19
|
Workflow Supervisor gives you a repeatable workflow for serious agent tasks:
|
|
20
20
|
|
|
21
21
|
- a complete intake before work starts
|
|
22
|
+
- a profile choice between `lean_work_unit_runner`, `strict_full_workflow`, and `planning_only`
|
|
22
23
|
- a source map, even when the only source is the user prompt
|
|
24
|
+
- a source-requirement coverage ledger so roadmap items and exit criteria cannot disappear
|
|
25
|
+
- a `SPEC.md` review gate where humans can ask questions, request revisions, block, defer, or approve before work units are finalized
|
|
23
26
|
- bounded work units, including `WU-001` for tiny tasks
|
|
27
|
+
- a compact ledger for high-throughput work-unit execution
|
|
24
28
|
- dossiers that tell each worker exactly what to do and what not to touch
|
|
25
29
|
- separate implementer, verifier, repair, and documenter responsibilities
|
|
26
30
|
- structured worker reports instead of loose prose
|
|
@@ -29,7 +33,7 @@ Workflow Supervisor gives you a repeatable workflow for serious agent tasks:
|
|
|
29
33
|
- durable `.workflow/` state when the work needs to survive context loss
|
|
30
34
|
- a final report with checks, risks, workers, and next actions
|
|
31
35
|
|
|
32
|
-
The main design choice is simple:
|
|
36
|
+
The main design choice is simple: supervision is mandatory when requested, but overhead is profile-dependent. Work units preserve clarity. Workers, dossiers, and independent verifier loops are tools for strict or escalated work, not a default tax on every unit.
|
|
33
37
|
|
|
34
38
|
## The Mental Model
|
|
35
39
|
|
|
@@ -86,43 +90,75 @@ flowchart TB
|
|
|
86
90
|
|
|
87
91
|
## What Happens When You Invoke It
|
|
88
92
|
|
|
89
|
-
When you explicitly invoke `workflow-supervisor`, `$workflow-supervisor`, or say to use the skill, the
|
|
93
|
+
When you explicitly invoke `workflow-supervisor`, `$workflow-supervisor`, or say to use the skill, the first decision is the execution profile:
|
|
90
94
|
|
|
91
|
-
|
|
95
|
+
- `lean_work_unit_runner`: for large, already-bounded work-unit backlogs where throughput and low memory matter.
|
|
96
|
+
- `strict_full_workflow`: for ambiguous, high-risk, delegated, security-sensitive, source-of-truth, publication, or cross-system work.
|
|
97
|
+
- `planning_only`: for intake, sequencing, risk review, and recommendations without implementation.
|
|
98
|
+
|
|
99
|
+
Lean mode keeps work units but removes per-unit ceremony. It uses one upfront scope contract, one compact ledger, targeted checks, and strict escalation gates:
|
|
100
|
+
|
|
101
|
+
```text
|
|
102
|
+
select next ready unit
|
|
103
|
+
-> inspect only needed sources
|
|
104
|
+
-> patch or update the allowed surface
|
|
105
|
+
-> run the targeted check
|
|
106
|
+
-> update one ledger row
|
|
107
|
+
-> continue until batch checkpoint, blocker, or final disposition
|
|
108
|
+
```
|
|
109
|
+
|
|
110
|
+
A lean unit is not ready unless it has:
|
|
111
|
+
|
|
112
|
+
```yaml
|
|
113
|
+
id:
|
|
114
|
+
source_ref:
|
|
115
|
+
scope:
|
|
116
|
+
done:
|
|
117
|
+
check:
|
|
118
|
+
status:
|
|
119
|
+
```
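A readiness check over the six fields above could be sketched as follows. The helper names `missingLeanFields` and `isLeanUnitReady`, and the rule that a blank value counts as missing, are assumptions made for illustration; the package does not ship this code.

```javascript
// Hypothetical helpers, not part of workflow-supervisor.
// Field names are taken from the lean-unit YAML skeleton in the README.
const LEAN_UNIT_FIELDS = ["id", "source_ref", "scope", "done", "check", "status"];

// Returns the required fields that are absent or blank on a ledger row.
function missingLeanFields(unit) {
  return LEAN_UNIT_FIELDS.filter((field) => {
    const value = unit[field];
    return value === undefined || value === null || String(value).trim() === "";
  });
}

// A unit counts as ready only when no required field is missing.
function isLeanUnitReady(unit) {
  return missingLeanFields(unit).length === 0;
}
```

Returning the list of missing fields, rather than a bare boolean, lets a ledger row record exactly what still has to be filled in.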
|
|
120
|
+
|
|
121
|
+
Strict mode is still available when risk justifies it. In strict mode, task size does not matter. The full workflow is:
|
|
92
122
|
|
|
93
123
|
1. Ask the complete intake packet.
|
|
94
124
|
2. Build or record the source corpus.
|
|
95
|
-
3. Create
|
|
96
|
-
4. Create
|
|
97
|
-
5.
|
|
98
|
-
6. Create
|
|
99
|
-
7.
|
|
100
|
-
8.
|
|
101
|
-
9.
|
|
102
|
-
10.
|
|
103
|
-
11.
|
|
104
|
-
12.
|
|
105
|
-
|
|
106
|
-
|
|
125
|
+
3. Create a source-requirement coverage ledger.
|
|
126
|
+
4. Create a `SPEC.md` review packet or file.
|
|
127
|
+
5. Pause for human Q&A, revisions, block, defer, or approval when the path is human-in-loop.
|
|
128
|
+
6. Create at least one work unit.
|
|
129
|
+
7. Create acceptance rows that preserve source-scope fidelity.
|
|
130
|
+
8. Create dossiers for the planned workers.
|
|
131
|
+
9. Create a worker-agent plan.
|
|
132
|
+
10. Ask for approval when the selected path is human-in-loop.
|
|
133
|
+
11. Delegate scoped work to real workers when the environment supports it.
|
|
134
|
+
12. Verify with evidence.
|
|
135
|
+
13. Route repair work if verification fails.
|
|
136
|
+
14. Refresh docs or outcome state.
|
|
137
|
+
15. Report final status and next action.
|
|
138
|
+
|
|
139
|
+
Profile selection exists to prevent both failure modes: skipping supervision when work is risky, and drowning simple or already-bounded work in process.
|
|
107
140
|
|
|
108
141
|
## Intake
|
|
109
142
|
|
|
110
|
-
The supervisor must get explicit answers to these
|
|
143
|
+
The supervisor must get explicit answers to these eight items before planning deeply, creating a goal, delegating workers, implementing, publishing, or taking irreversible action:
|
|
111
144
|
|
|
112
145
|
```text
|
|
113
146
|
1. Objective and source: what artifact, spec, repo path, document, ticket, or source set controls the work?
|
|
114
|
-
2.
|
|
115
|
-
3.
|
|
116
|
-
4.
|
|
117
|
-
5.
|
|
118
|
-
6.
|
|
119
|
-
7.
|
|
147
|
+
2. Profile: lean_work_unit_runner, strict_full_workflow, or planning_only?
|
|
148
|
+
3. Execution path: autonomous_goal or human_in_loop?
|
|
149
|
+
4. Mode: sequential, parallel where safe, or staged parallel?
|
|
150
|
+
5. Delegation: same-session phased, automated worker delegation, or native threads/subagents if available?
|
|
151
|
+
6. Final disposition: keep local, open PR, push main, deploy/publish, or ask at the end?
|
|
152
|
+
7. Boundaries: may I install dependencies, call external services, use credentials, or only edit local files?
|
|
153
|
+
8. State artifacts: compact ledger, .workflow docs, another artifact directory, or inline state?
|
|
120
154
|
```
|
|
121
155
|
|
|
122
156
|
If any answer is missing or vague, the supervisor asks only for the missing pieces and stops. Phrases like "work autonomously", "just do it", or "use your judgment" do not fill in the missing intake fields.
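The "ask only for the missing pieces" rule can be illustrated with a small sketch. The field keys in `INTAKE_FIELDS` and the function name `missingIntake` are hypothetical labels for the eight intake items; only the three vague phrases are quoted from the README.

```javascript
// Hypothetical keys for the eight intake items; the package does not define these names.
const INTAKE_FIELDS = [
  "objective_and_source", "profile", "execution_path", "mode",
  "delegation", "final_disposition", "boundaries", "state_artifacts",
];

// Phrases the README says do not fill in missing intake fields.
const VAGUE_ANSWERS = new Set(["work autonomously", "just do it", "use your judgment"]);

// Returns the intake items that still need an explicit answer.
function missingIntake(answers) {
  return INTAKE_FIELDS.filter((field) => {
    const raw = answers[field];
    if (typeof raw !== "string") return true; // unanswered
    const text = raw.trim().toLowerCase();
    return text === "" || VAGUE_ANSWERS.has(text); // blank or vague
  });
}
```

An empty result means intake is complete; anything else is the exact list of questions to ask before stopping.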
|
|
123
157
|
|
|
124
158
|
Expected human pauses are normal. A workflow can move from `WAITING_FOR_HUMAN` back to `ACTIVE` after the user approves a plan or answers a blocker question.
|
|
125
159
|
|
|
160
|
+
In `autonomous_goal`, a human clarification pause is not automatically a terminal failed goal. The supervisor records the blocker, asks the smallest needed question, updates SPEC/Q&A/coverage state when the answer arrives, refreshes only affected downstream artifacts, and resumes from the recorded next action. If an old Codex goal was already terminal-blocked, the resumed workflow references it as history and continues from workflow state or a newly authorized goal binding.
|
|
161
|
+
|
|
126
162
|
## The Workflow
|
|
127
163
|
|
|
128
164
|
The full loop looks like this:
|
|
@@ -130,6 +166,8 @@ The full loop looks like this:
|
|
|
130
166
|
```text
|
|
131
167
|
complete intake
|
|
132
168
|
-> source corpus
|
|
169
|
+
-> source-requirement coverage ledger
|
|
170
|
+
-> SPEC review and Q&A gate
|
|
133
171
|
-> work units
|
|
134
172
|
-> loop policy
|
|
135
173
|
-> acceptance matrix
|
|
@@ -147,10 +185,16 @@ complete intake
|
|
|
147
185
|
The worker lifecycle is tracked separately:
|
|
148
186
|
|
|
149
187
|
```text
|
|
150
|
-
planned -> handed_off -> acknowledged -> reported -> verified -> closed
|
|
188
|
+
planned -> handed_off -> acknowledged -> reported -> verified -> resource_closed -> closed
|
|
151
189
|
```
|
|
152
190
|
|
|
153
|
-
This makes it possible to see where the workflow is, which worker owns which piece, what evidence exists, and
|
|
191
|
+
This makes it possible to see where the workflow is, which worker owns which piece, what evidence exists, what native resource was opened, and whether that resource was closed. A native worker is not closed just because it returned a report.
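The lifecycle order can be read as a simple linear state machine. The sketch below assumes that every transition moves exactly one step forward; the README lists the order but does not state the transition rules, so `advance` and its strictness are illustrative assumptions.

```javascript
// Lifecycle order as listed in the 0.1.4 README.
const LIFECYCLE = [
  "planned", "handed_off", "acknowledged", "reported",
  "verified", "resource_closed", "closed",
];

// Hypothetical helper: permits only a move to the immediately following state.
function advance(worker, nextState) {
  const current = LIFECYCLE.indexOf(worker.state);
  const next = LIFECYCLE.indexOf(nextState);
  if (current === -1 || next !== current + 1) {
    throw new Error(`illegal transition: ${worker.state} -> ${nextState}`);
  }
  // Return a new record so earlier snapshots stay unchanged.
  return { ...worker, state: nextState };
}
```

Under this assumption a worker cannot jump from `reported` to `closed`; it has to pass through `verified` and `resource_closed` first.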
|
|
192
|
+
|
|
193
|
+
For source-of-truth builds, the coverage ledger is the guardrail against "green but incomplete" outcomes. Every material source requirement must be mapped to a work unit and acceptance row, explicitly deferred by the user, blocked as a scope decision, or marked non-material with a reason. Residual risks and future-work notes cannot contain unimplemented material source requirements in a PASS workflow.
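A coverage check along these lines might look like the sketch below. The row shape (`id`, `workUnit`, `acceptanceRow`, `disposition`, `reason`) and the disposition strings are assumptions made for illustration, not the package's ledger format.

```javascript
// Hypothetical ledger row: { id, workUnit, acceptanceRow, disposition, reason }
// Returns ids of requirements that are neither mapped nor explicitly disposed of.
function uncoveredRequirements(rows) {
  return rows
    .filter((row) => {
      // Explicit user deferral or a scope-decision block counts as covered.
      if (row.disposition === "deferred" || row.disposition === "blocked") return false;
      // Non-material is accepted only with a recorded reason.
      if (row.disposition === "non_material") return !row.reason;
      // Otherwise the requirement needs both a work unit and an acceptance row.
      return !(row.workUnit && row.acceptanceRow);
    })
    .map((row) => row.id);
}
```

A PASS outcome would require this list to be empty, which is what prevents a "green but incomplete" result.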
|
|
194
|
+
|
|
195
|
+
`SPEC.md` is the human review contract before final work units. In human-in-loop mode, the supervisor stops at the draft SPEC so the human can ask questions, request revisions, mark items deferred, block the workflow, or approve. The workflow continues only after explicit approval.
|
|
196
|
+
|
|
197
|
+
When a workflow pauses for a human decision, the decision is recorded as state rather than treated as a restart. The next supervisor pass updates the affected coverage rows, SPEC fields, work units, acceptance rows, dossiers, or verification results, invalidates stale artifacts, and continues from the saved `Next Action`.
|
|
154
198
|
|
|
155
199
|
## Skills In The Pack
|
|
156
200
|
|
|
@@ -181,16 +225,17 @@ Common workflow files:
|
|
|
181
225
|
|---|---|---|
|
|
182
226
|
| `.workflow/WORKFLOW.md` | `workflow-supervisor`, `loop-policy`, `workflow-docs` | Main state, objective, execution path, policy, stop gates, next action. |
|
|
183
227
|
| `.workflow/SOURCE-CORPUS.md` | `source-corpus`, `workflow-docs` | Source ranking, missing sources, contradictions, assumptions. |
|
|
228
|
+
| `.workflow/SPEC.md` | `workflow-supervisor`, `source-corpus`, `workflow-docs` | Human-reviewable interpretation, requirement coverage, Q&A, and approval decision before work units. |
|
|
184
229
|
| `.workflow/WORK-UNITS.md` | `work-unit`, `workflow-docs` | Unit list, dependencies, sequencing, blocked units. |
|
|
185
230
|
| `.workflow/DOSSIER.md` or `.workflow/dossiers/*.yaml` | `dossier-builder`, `workflow-docs` | Worker contracts for implementation, verification, repair, or documentation. |
|
|
186
|
-
| `.workflow/WORKER-MAP.md` | `workflow-supervisor`, `worker-roles`, `workflow-docs` | Worker names, roles, transports, lifecycle, reports, blockers. |
|
|
231
|
+
| `.workflow/WORKER-MAP.md` | `workflow-supervisor`, `worker-roles`, `workflow-docs` | Worker names, roles, transports, native resource ids, lifecycle, reports, close results, blockers. |
|
|
187
232
|
| `.workflow/ACCEPTANCE-MATRIX.md` | `acceptance-matrix`, `workflow-docs` | Evidence rows and material PASS, FAIL, BLOCKED states. |
|
|
188
233
|
| `.workflow/VERIFICATION-REPORT.md` | verifier worker, `acceptance-matrix`, `workflow-docs` | Verification evidence, findings, skipped checks, residual risks. |
|
|
189
234
|
| `.workflow/REPAIR-TICKETS.md` | repair worker, `workflow-docs` | Repair tasks tied to failed rows or verifier findings. |
|
|
190
235
|
| `.workflow/DECISIONS.md` | supervisor, `workflow-docs` | User decisions, assumptions, reversals, unresolved questions. |
|
|
191
236
|
| `.workflow/HANDOFF.md` | supervisor, `workflow-docs` | Resume pack for another agent or later session. |
|
|
192
237
|
| `.workflow/OUTCOME.md` | supervisor, documenter worker, `workflow-docs` | Final status, checks, risks, disposition, next action. |
|
|
193
|
-
| `.workflow/GOAL-STATE.md` | supervisor, `workflow-docs` | Codex goal mirror
|
|
238
|
+
| `.workflow/GOAL-STATE.md` | supervisor, `workflow-docs` | Codex goal mirror, blocked-goal history, human-decision resume checkpoint, and durable backup. |
|
|
194
239
|
|
|
195
240
|
For documentation-heavy workflows, `workflow-docs` can also create:
|
|
196
241
|
|
|
@@ -297,9 +342,11 @@ Every delegated worker returns this machine-shaped report:
|
|
|
297
342
|
|
|
298
343
|
The supervisor trusts the report shape, not loose prose. A PASS without evidence is invalid. A verifier that edits implementation is invalid. A worker that asks the human directly is converted into a blocker for the supervisor to route.
|
|
299
344
|
|
|
345
|
+
For native threads or subagents, the report is only the work result. The supervisor must also close the native resource. For Codex subagents, record the returned `agent_id` and call `close_agent` after the report, timeout, failure, blocker, cancellation, or invalid-output result is captured. Final outcome is blocked while any native worker lacks a close result.
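The close-result gate can be expressed as a filter over the worker map. This is a minimal sketch: the entry shape and the `"native"` transport label are hypothetical, and the function does not call `close_agent` itself; it only reports which workers still block the final outcome.

```javascript
// Hypothetical worker-map entry: { name, transport, resourceId, closeResult }
// "native" stands in for thread or subagent transports; other values are treated as one-shot.
function workersBlockingFinalOutcome(workers) {
  return workers
    .filter((w) => w.transport === "native" && !w.closeResult)
    .map((w) => w.name);
}

// Final outcome may be reported only when no native worker lacks a close result.
function canReportFinalOutcome(workers) {
  return workersBlockingFinalOutcome(workers).length === 0;
}
```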
|
|
346
|
+
|
|
300
347
|
## How The Supervisor Talks To Workers
|
|
301
348
|
|
|
302
|
-
The portable worker path is one CLI command:
|
|
349
|
+
The portable worker path is one CLI command and is preferred when it satisfies the work because it is one-shot:
|
|
303
350
|
|
|
304
351
|
```bash
|
|
305
352
|
workflow-supervisor delegate \
|
|
@@ -335,9 +382,11 @@ workflow-supervisor delegate-doctor --agent all --probe --require-pass
|
|
|
335
382
|
|
|
336
383
|
If a worker adapter is missing, unauthenticated, times out, returns invalid output, edits forbidden surfaces, or returns PASS without evidence, the delegate command returns a structured `BLOCKED` report.
|
|
337
384
|
|
|
385
|
+
Native thread or subagent transports may be used only when the environment exposes the full lifecycle: create, wait or receive a terminal report, and close. If a native transport can start workers but cannot close them, the supervisor records `worker_resource_close_unavailable` and uses portable delegation or same-session phased work only when intake allows it.
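One way to read this rule, combined with the stated preference for portable delegation, is the selection sketch below. The capability flags on `env`, the permission flags on `intake`, and the order of preference are all assumptions made for illustration; the package may decide differently.

```javascript
// Hypothetical inputs:
//   env:    { canCreate, canAwaitReport, canClose }           native transport capabilities
//   intake: { allowPortable, allowNative, allowSameSession }  what the user authorized
function chooseTransport(env, intake) {
  const notes = [];
  // A transport that can start workers but not close them is recorded, never used.
  if (env.canCreate && !env.canClose) notes.push("worker_resource_close_unavailable");

  // Portable delegation is one-shot, so it is tried first when intake allows it.
  if (intake.allowPortable) return { transport: "portable", notes };

  // Native transports need the full lifecycle: create, await report, close.
  const fullLifecycle = env.canCreate && env.canAwaitReport && env.canClose;
  if (intake.allowNative && fullLifecycle) return { transport: "native", notes };

  if (intake.allowSameSession) return { transport: "same_session_phased", notes };

  // Nothing authorized fits: stop for a human decision.
  return { transport: null, notes };
}
```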
|
|
386
|
+
|
|
338
387
|
## No Silent Fallbacks
|
|
339
388
|
|
|
340
|
-
|
|
389
|
+
In `strict_full_workflow` with worker delegation selected, if the environment can create, message, or delegate to worker agents, the supervisor must use real workers for implementation, verification, repair, and documentation responsibilities.
|
|
341
390
|
|
|
342
391
|
If it cannot, it must record:
|
|
343
392
|
|
|
@@ -347,7 +396,7 @@ worker_agent_unavailable
|
|
|
347
396
|
|
|
348
397
|
Then it must stop for a human decision unless complete intake explicitly selected `same_session_phased`.
|
|
349
398
|
|
|
350
|
-
|
|
399
|
+
In `lean_work_unit_runner`, same-session phased execution is the default unless the user explicitly authorizes workers for a batch or escalation. Verification in same-session mode is a `focused-check` or `self-check`, not an `independent-verifier`.
|
|
351
400
|
|
|
352
401
|
## Install
|
|
353
402
|
|
|
@@ -410,10 +459,12 @@ You should expect:
|
|
|
410
459
|
1. The supervisor asks the complete intake packet.
|
|
411
460
|
2. You answer every intake item.
|
|
412
461
|
3. If the path is `human_in_loop`, the supervisor gives you an approval packet before implementation.
|
|
413
|
-
4. The supervisor creates
|
|
414
|
-
5.
|
|
415
|
-
6.
|
|
416
|
-
7. The supervisor
|
|
462
|
+
4. The supervisor creates the source-requirement coverage ledger and `SPEC.md`.
|
|
463
|
+
5. You ask questions, request revisions, block, defer, or approve the SPEC.
|
|
464
|
+
6. After approval, the supervisor creates work units, acceptance rows, and dossiers.
|
|
465
|
+
7. The supervisor delegates scoped work to workers when supported.
|
|
466
|
+
8. Workers return structured reports.
|
|
467
|
+
9. The supervisor verifies, routes repairs if needed, and gives you the final result.
|
|
417
468
|
|
|
418
469
|
If you only want a normal quick edit, do not invoke `$workflow-supervisor`.
|
|
419
470
|
|
|
@@ -520,11 +571,11 @@ If you are an agent using this package:
|
|
|
520
571
|
|
|
521
572
|
1. Do not start work before complete intake.
|
|
522
573
|
2. Do not infer missing permissions from words like "autonomous", "generate", or "work until done".
|
|
523
|
-
3. If `$workflow-supervisor` is explicit,
|
|
524
|
-
4.
|
|
525
|
-
5.
|
|
526
|
-
6.
|
|
527
|
-
7. Treat same-session verification as `self-check`, not `independent-verifier`.
|
|
574
|
+
3. If `$workflow-supervisor` is explicit, select `lean_work_unit_runner`, `strict_full_workflow`, or `planning_only` before heavy planning.
|
|
575
|
+
4. Always keep work units explicit; lean mode uses a compact ledger instead of full per-unit ceremony.
|
|
576
|
+
5. Do not delegate without a valid `DossierV1`.
|
|
577
|
+
6. Use separate worker agents in strict or explicitly delegated work, not by default for lean execution.
|
|
578
|
+
7. Treat same-session verification as `focused-check` or `self-check`, not `independent-verifier`.
|
|
528
579
|
8. Trust only structured `WorkerReportV1` results from delegated workers.
|
|
529
580
|
9. Treat verifier edits as invalid.
|
|
530
581
|
10. Keep `.workflow/` ignored and local unless the user explicitly asks to publish it.
|
package/bin/workflow-skills.mjs
CHANGED
|
@@ -8,8 +8,9 @@ import { fileURLToPath } from "node:url";
|
|
|
8
8
|
|
|
9
9
|
const __dirname = path.dirname(fileURLToPath(import.meta.url));
|
|
10
10
|
const packageRoot = path.resolve(__dirname, "..");
|
|
11
|
-
const
|
|
12
|
-
const
|
|
11
|
+
const packageJson = JSON.parse(fs.readFileSync(path.join(packageRoot, "package.json"), "utf8"));
|
|
12
|
+
const PACKAGE_NAME = packageJson.name || "workflow-supervisor";
|
|
13
|
+
const PACKAGE_VERSION = packageJson.version;
|
|
13
14
|
const WORKER_REPORT_SCHEMA_PATH = path.join(packageRoot, "schemas", "worker-report-v1.schema.json");
|
|
14
15
|
const DOSSIER_SCHEMA_PATH = path.join(packageRoot, "schemas", "dossier-v1.schema.json");
|
|
15
16
|
const ADAPTERS_ROOT = path.join(packageRoot, "adapters");
|
package/docs/artifacts.md
CHANGED
|
@@ -8,6 +8,7 @@ In Git-backed codebases, `.workflow/` is local working state. Ensure `<workspace
|
|
|
8
8
|
|
|
9
9
|
## Workflow Control
|
|
10
10
|
|
|
11
|
+
- `.workflow/LEDGER.md`
|
|
11
12
|
- `.workflow/WORKFLOW.md`
|
|
12
13
|
- `.workflow/SOURCE-CORPUS.md`
|
|
13
14
|
- `.workflow/WORK-UNITS.md`
|
|
@@ -40,3 +41,7 @@ In Git-backed codebases, `.workflow/` is local working state. Ensure `<workspace
|
|
|
40
41
|
## State Medium
|
|
41
42
|
|
|
42
43
|
Markdown is the default, but state may also be an inline brief, spreadsheet tab, ticket set, design annotation, CRM note, runbook, decision log, slide appendix, whiteboard note, or chat continuation note.
|
|
44
|
+
|
|
45
|
+
For `lean_work_unit_runner`, prefer one compact ledger over multiple workflow documents. Each executable row should carry `id`, `source_ref`, `scope`, `done`, `check`, `status`, touched surfaces, and blockers. Escalated units may link to strict-mode SPEC, dossier, or verification artifacts only when needed.
|
|
46
|
+
|
|
47
|
+
For native thread or subagent delegation, `WORKER-MAP.md` must record the native resource id, terminal report, close action, and close result. Do not mark a native worker closed until the resource close is recorded.
|
package/docs/portable-delegation.md
CHANGED
|
@@ -18,6 +18,10 @@ complete intake
|
|
|
18
18
|
-> final supervisor report
|
|
19
19
|
```
|
|
20
20
|
|
|
21
|
+
This document describes strict or explicitly delegated execution. `lean_work_unit_runner` normally stays in same-session phased execution with a compact ledger and targeted checks. It should enter portable delegation only when the user authorizes workers for a batch or a unit hits a strict-mode escalation trigger.
|
|
22
|
+
|
|
23
|
+
Prefer portable delegation over native threads or subagents when it satisfies the work. Portable delegation is one-shot, so the worker process exits after the report. Native thread or subagent transports are allowed only when the supervisor can record the native resource id and call the matching close operation after terminal report, timeout, blocker, cancellation, or invalid output.
|
|
24
|
+
|
|
21
25
|
The supervisor remains the only coordinator. Workers do not ask the human questions, choose final disposition, expand scope, approve plans, or talk to each other. If a worker needs a decision, it returns `BLOCKED` with a `blocking_question`; only the supervisor asks the user.
|
|
22
26
|
|
|
23
27
|
## Non-Goals
|
|
@@ -163,6 +167,7 @@ For git workspaces, the surface guard compares pre/post git status. Mutable role
|
|
|
163
167
|
| Repair expands scope | Reject unless the repair dossier explicitly allowed the new surfaces and criteria. |
|
|
164
168
|
| Units touch same surfaces | Run sequentially. Parallel delegation requires proven disjoint mutable surfaces. |
|
|
165
169
|
| Platform has no native subagents | Fine. Each role is a fresh one-shot CLI process. |
|
|
170
|
+
| Native subagent close is unavailable | Do not spawn it. Return `worker_resource_close_unavailable` and use portable delegation or same-session phased work only if intake allowed it. |
|
|
166
171
|
| Platform output differs | Platform output is not the contract. `WorkerReportV1` is the only supervisor input. |
|
|
167
172
|
| Platform cannot support a role safely | Adapter role is unsupported. Supervisor chooses another certified adapter or blocks. |
|
|
168
173
|
| Full support is claimed but one CLI is absent | `delegate-doctor --agent all --probe --require-pass` exits nonzero and names the missing adapter. |
|
package/docs/skill-reference.md
CHANGED
|
@@ -2,7 +2,7 @@
|
|
|
2
2
|
|
|
3
3
|
## `workflow-supervisor`
|
|
4
4
|
|
|
5
|
-
Coordinate explicit supervised or agent-loop workflows. It
|
|
5
|
+
Coordinate explicit supervised or agent-loop workflows with profile-based overhead. It starts by selecting `lean_work_unit_runner`, `strict_full_workflow`, or `planning_only`, then completes the intake needed for that profile before implementation, goal binding, worker delegation, or final disposition. The user must answer required intake items; the supervisor must not infer path, mode, delegation, final disposition, or boundaries from vague keywords. Lean mode is for large already-bounded work-unit backlogs: it keeps a compact ledger with unit id, source reference, scope, done signal, check, status, touched surfaces, and blockers, then executes one ready unit at a time with targeted checks and escalation gates. Strict mode creates a source-requirement coverage ledger and SPEC review gate before work units so controlling-source deliverables, roadmap phases, and exit criteria are either implemented, explicitly deferred, blocked, or marked non-material. In human-in-loop mode, the human can ask questions, request revisions, block, defer, or approve before execution. In autonomous goal mode, human clarification pauses resume from recorded workflow state after the answer updates only affected downstream artifacts. Strict mode can orchestrate named workers from dossiers through the portable delegate command or an approved native adapter. Native threads and subagents require a recorded native resource id plus a close result, such as `close_agent` for Codex subagents, before a worker is `closed`. Loading the skill itself does not spawn workers. It binds Codex goals only after complete intake and when the user or environment authorizes goal-oriented work, checks active goal state first, avoids unrelated active-goal collisions, and treats terminal blocked goals as history when resuming through workflow docs.
|
|
6
6
|
|
|
7
7
|
## `source-corpus`
|
|
8
8
|
|
|
@@ -10,7 +10,7 @@ Rank and reconcile sources when authority, freshness, contradictions, access rig
|
|
|
10
10
|
|
|
11
11
|
## `work-unit`
|
|
12
12
|
|
|
13
|
-
Split broad work into bounded units with objective, scope, dependencies, readiness, done criteria, verification, sequencing, and parallel-safety notes.
|
|
13
|
+
Split broad work into bounded units with objective, scope, dependencies, readiness, done criteria, verification, sequencing, and parallel-safety notes. It prevents broad roadmap or source-of-truth requests from collapsing into one giant unit unless all current-scope material requirements can be implemented and verified in that unit.
|
|
14
14
|
|
|
15
15
|
## `dossier-builder`
|
|
16
16
|
|
|
@@ -22,12 +22,12 @@ Define role contracts and solo-mode phase separation. It prevents role bleed: ve
|
|
|
22
22
|
|
|
23
23
|
## `acceptance-matrix`
|
|
24
24
|
|
|
25
|
-
Create formal evidence-mapped acceptance rows for high-risk, supervised, ambiguous, resumable, or delegated workflows.
|
|
25
|
+
Create formal evidence-mapped acceptance rows for high-risk, supervised, ambiguous, resumable, or delegated workflows. Rows must preserve source requirement strength, including named systems, quantities, live integration language, and exit criteria; weaker proxy checks require explicit user waiver or scope narrowing.
|
|
26
26
|
|
|
27
27
|
## `loop-policy`
|
|
28
28
|
|
|
29
|
-
Define execution path, execution mode, worker delegation, approval gates, repair limits, parallel safety, no-progress rules, and Codex goal tool policy.
|
|
29
|
+
Define execution path, execution mode, worker delegation, approval gates, repair limits, parallel safety, no-progress rules, human-decision resume rules, and Codex goal tool policy.
|
|
30
30
|
|
|
31
31
|
## `workflow-docs`
|
|
32
32
|
|
|
33
|
-
Create durable workflow-state or documentation-production artifacts. Markdown artifacts default to `<workspace>/.workflow/` unless the user or project convention says otherwise. It also supports inline briefs, tickets, design annotations, runbooks, decision logs, and other usable state media.
|
|
33
|
+
Create durable workflow-state or documentation-production artifacts. Markdown artifacts default to `<workspace>/.workflow/` unless the user or project convention says otherwise. It includes `SPEC.md` for human-readable interpretation, Q&A, and approval before work units, plus `GOAL-STATE.md` and resume fields for blocked-goal history and human-decision continuation. It also supports inline briefs, tickets, design annotations, runbooks, decision logs, and other usable state media.
|
package/docs/troubleshooting.md
CHANGED
|
@@ -23,10 +23,36 @@ Use `.workflow/GOAL-STATE.md` or a workflow continuation document. The superviso
|
|
|
23
23
|
|
|
24
24
|
Use `$workflow-docs` with a minimal artifact request. The skill must reject "create every document just in case."
|
|
25
25
|
|
|
26
|
+
## Large backlogs run slowly or exhaust memory
|
|
27
|
+
|
|
28
|
+
Use `lean_work_unit_runner` instead of `strict_full_workflow` when the source already contains clear work units and the user's priority is throughput. Keep one compact ledger with `id`, `source_ref`, `scope`, `done`, `check`, `status`, touched surfaces, and blockers. Run one unit at a time by default, avoid subagents unless explicitly authorized, avoid broad scans unless required for the current unit, and checkpoint by batch rather than rewriting full workflow docs after every unit.
|
|
29
|
+
|
|
30
|
+
Do not remove work units to make the process lean. If a unit cannot name its boundary, done signal, or targeted check, mark it `blocked` or escalate that unit to strict mode.
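
The compact ledger described above can be pictured as a small record per unit. The following is a minimal Python sketch, not part of the published package: the class name `LedgerUnit`, the helper names, and every status value other than `blocked` are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LedgerUnit:
    # Field names follow the ledger columns listed in the troubleshooting text.
    id: str
    source_ref: str
    scope: str   # the unit's boundary
    done: str    # the done signal
    check: str   # the targeted check
    status: str = "ready"  # "ready" is a hypothetical value; only "blocked" comes from the text
    touched_surfaces: List[str] = field(default_factory=list)
    blockers: List[str] = field(default_factory=list)


def triage(unit: LedgerUnit) -> LedgerUnit:
    """Mark a unit blocked when it cannot name its boundary, done signal, or check."""
    named = (("scope", unit.scope), ("done", unit.done), ("check", unit.check))
    missing = [name for name, value in named if not value.strip()]
    if missing:
        unit.status = "blocked"
        unit.blockers.append("missing: " + ", ".join(missing))
    return unit


def next_ready(ledger: List[LedgerUnit]) -> Optional[LedgerUnit]:
    """Return the first ready unit, so that only one unit runs at a time."""
    for unit in ledger:
        if unit.status == "ready":
            return unit
    return None


ledger = [
    LedgerUnit("U1", "ROADMAP.md#1", "parser module", "parser tests pass", "run parser tests"),
    LedgerUnit("U2", "ROADMAP.md#2", "", "", ""),
]
for unit in ledger:
    triage(unit)
```

In this sketch the under-specified unit stays in the ledger with a recorded blocker instead of being removed, which mirrors the rule that leanness must not come from dropping work units.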
|
|
31
|
+
|
|
32
|
+
## Native subagents remain open after completion
|
|
33
|
+
|
|
34
|
+
Treat this as a lifecycle bug, not a cosmetic cleanup task. A terminal report or completed notification does not close a native Codex subagent. Record every native worker id in `WORKER-MAP.md`, call the native close action such as `close_agent` after the terminal report or blocker is captured, and block the final outcome if any native worker lacks a close result. Prefer one-shot portable delegation when it satisfies the work.
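
The close-result rule above amounts to a simple gate on the final outcome. Here is a minimal Python sketch of that gate, not part of the published package: the `NativeWorker` record and both function names are hypothetical, and the real `WORKER-MAP.md` layout is not assumed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NativeWorker:
    worker_id: str                      # native resource id as recorded in WORKER-MAP.md
    terminal_report_captured: bool      # a report alone does not close the worker
    close_result: Optional[str] = None  # outcome of the native close action, if one was recorded


def unclosed_workers(workers: List[NativeWorker]) -> List[str]:
    """List the ids of native workers that still lack a close result."""
    return [w.worker_id for w in workers if w.close_result is None]


def final_outcome_allowed(workers: List[NativeWorker]) -> bool:
    """Block the final outcome while any native worker lacks a close result."""
    return not unclosed_workers(workers)


workers = [
    NativeWorker("agent-1", terminal_report_captured=True, close_result="closed"),
    NativeWorker("agent-2", terminal_report_captured=True),  # reported but never closed
]
```

Note that `agent-2` has a captured terminal report and still fails the gate, which is the point of treating this as a lifecycle bug.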
|
|
35
|
+
|
|
26
36
|
## Verification rubber-stamps the result
|
|
27
37
|
|
|
28
38
|
Use `$acceptance-matrix` for formal evidence rows. A PASS requires row-by-row evidence or explicit waiver evidence.
|
|
29
39
|
|
|
40
|
+
## A broad roadmap becomes one giant work unit
|
|
41
|
+
|
|
42
|
+
Use the source-requirement coverage gate before work-unit finalization. Every material roadmap item, exit criterion, named integration, and numeric target should be mapped to a unit and acceptance row, explicitly deferred by the user, blocked for a decision, or marked non-material with a reason. Do not accept "future work" or residual risk notes as a substitute for work units.
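
One way to read the coverage gate is as a check over every material requirement. The Python sketch below is illustrative only and not part of the published package: the `Requirement` record and the disposition labels (`mapped`, `deferred`, `blocked`, `non_material`) are assumptions chosen to match the four outcomes named above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Requirement:
    ref: str                              # roadmap item, exit criterion, integration, or numeric target
    disposition: str                      # hypothetical labels, see coverage_gaps below
    unit_id: Optional[str] = None         # work unit covering the requirement
    acceptance_row: Optional[str] = None  # acceptance-matrix row covering it
    deferred_by_user: bool = False
    reason: str = ""


def coverage_gaps(requirements: List[Requirement]) -> List[str]:
    """Return refs of requirements that do not satisfy any accepted disposition."""
    gaps = []
    for r in requirements:
        if r.disposition == "mapped":
            ok = bool(r.unit_id and r.acceptance_row)
        elif r.disposition == "deferred":
            ok = r.deferred_by_user          # deferral must be explicit and come from the user
        elif r.disposition == "blocked":
            ok = True                        # waiting on a decision
        elif r.disposition == "non_material":
            ok = bool(r.reason.strip())      # needs a stated reason
        else:
            ok = False                       # e.g. "future_work" is not an accepted substitute
        if not ok:
            gaps.append(r.ref)
    return gaps


requirements = [
    Requirement("phase-1-exit", "mapped", unit_id="U1", acceptance_row="A1"),
    Requirement("provider-x-integration", "future_work"),
    Requirement("latency-budget", "non_material"),  # no reason given
]
```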
|
|
43
|
+
|
|
44
|
+
## Residual risks contain required work
|
|
45
|
+
|
|
46
|
+
Treat this as FAIL or BLOCKED. Residual risks may describe remaining uncertainty after acceptance, but they must not contain unimplemented material source requirements, skipped mandatory checks, or source-of-truth deliverables that were quietly downgraded.
|
|
47
|
+
|
|
48
|
+
## Humans need to review scope before work units
|
|
49
|
+
|
|
50
|
+
Create or refresh `.workflow/SPEC.md` before final work units. The human can ask questions in the Q&A section, request revision, block the workflow, defer items, or approve. In `human_in_loop`, the supervisor must not continue to final work units, dossiers, or implementation until the SPEC decision is approved and Q&A is answered.
|
|
51
|
+
|
|
52
|
+
## Autonomous workflow paused for a human decision
|
|
53
|
+
|
|
54
|
+
Record the blocker before asking the human. When the answer arrives, update `.workflow/SPEC.md`, `.workflow/WORKFLOW.md`, `.workflow/GOAL-STATE.md`, and `DECISIONS.md` when present. Re-run only the affected coverage, SPEC, work-unit, acceptance, dossier, worker-plan, verification, or final-disposition steps. Do not restart complete intake unless the answer changes a required intake decision. If the old Codex goal is terminal blocked, reference it as history and continue from workflow state or a newly authorized goal binding.
|
|
55
|
+
|
|
30
56
|
## An existing skill folder blocks install
|
|
31
57
|
|
|
32
58
|
Use:
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "workflow-supervisor",
|
|
3
|
-
"version": "0.1.
|
|
3
|
+
"version": "0.1.4",
|
|
4
4
|
"description": "Portable workflow supervision skills for Codex, Claude Code, and generic agent workspaces.",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"repository": {
|
|
@@ -19,9 +19,15 @@
|
|
|
19
19
|
"skills",
|
|
20
20
|
"adapters",
|
|
21
21
|
"schemas",
|
|
22
|
-
"docs",
|
|
22
|
+
"docs/artifacts.md",
|
|
23
|
+
"docs/cli.md",
|
|
24
|
+
"docs/compatibility.md",
|
|
25
|
+
"docs/portable-delegation.md",
|
|
26
|
+
"docs/skill-reference.md",
|
|
27
|
+
"docs/troubleshooting.md",
|
|
23
28
|
"assets",
|
|
24
29
|
"bin",
|
|
30
|
+
"CHANGELOG.md",
|
|
25
31
|
"README.md",
|
|
26
32
|
"LICENSE"
|
|
27
33
|
],
|
package/skills/acceptance-matrix/SKILL.md
CHANGED
|
@@ -15,15 +15,39 @@ This skill owns evidence rows and supervisor verdict mapping. `$work-unit` may d
|
|
|
15
15
|
|
|
16
16
|
- Every requirement needs evidence.
|
|
17
17
|
- Evidence must name a source, command, artifact, UI state, test, inspection, or user decision.
|
|
18
|
+
- Acceptance rows must preserve the source requirement's strength: named systems, quantities, live/integration wording, exit criteria, and "must" language.
|
|
19
|
+
- A weaker proxy check is not equivalent evidence unless the user explicitly waives or narrows the original requirement.
|
|
18
20
|
- PASS requires all material rows to be satisfied or explicitly waived by the user.
|
|
19
21
|
- FAIL requires at least one material row with unmet evidence.
|
|
20
22
|
- BLOCKED applies when evidence cannot be obtained or sources conflict.
|
|
21
23
|
- Residual risks must not be hidden inside PASS.
|
|
24
|
+
- If residual risks, skipped checks, future work, or next recommended actions contain an unimplemented material source requirement, the matrix status is FAIL or BLOCKED, not PASS.
|
|
25
|
+
|
|
26
|
+
## Source Fidelity Rules
|
|
27
|
+
|
|
28
|
+
When a source-requirement coverage ledger exists, every `in_current_scope` material requirement needs at least one matrix row. Preserve exact source details that affect scope or verification, including:
|
|
29
|
+
|
|
30
|
+
- named integrations or providers
|
|
31
|
+
- corpus sizes, question counts, coverage thresholds, or latency/cost budgets
|
|
32
|
+
- "live" versus artifact-only behavior
|
|
33
|
+
- required data sources, credentials, services, or indexes
|
|
34
|
+
- roadmap phase exit criteria
|
|
35
|
+
- mandatory checks, screenshots, reports, or review states
|
|
36
|
+
|
|
37
|
+
Do not downgrade requirements while making them testable. Examples of invalid substitutions:
|
|
38
|
+
|
|
39
|
+
- live service load/query verification -> generated export file only
|
|
40
|
+
- required validation corpus size -> small starter fixture
|
|
41
|
+
- named provider support -> generic extension hook
|
|
42
|
+
- required analysis and report generation -> keyword metadata only
|
|
43
|
+
- provider-backed extraction or indexing -> deterministic placeholder logic
|
|
44
|
+
|
|
45
|
+
If a requirement cannot be verified in the current environment, mark it BLOCKED or require a user waiver. Do not convert it into an easier row.
|
|
22
46
|
|
|
23
47
|
## Row Shape
|
|
24
48
|
|
|
25
|
-
| ID | Requirement | Evidence Required | Verification Method | Adversarial Check | Status | Evidence |
|
|
26
|
-
|
|
49
|
+
| ID | Source Ref | Requirement | Evidence Required | Verification Method | Adversarial Check | Status | Evidence |
|
|
50
|
+
|---|---|---|---|---|---|---|---|
|
|
27
51
|
|
|
28
52
|
Use statuses: Pending, PASS, FAIL, BLOCKED, Waived.
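
The PASS, FAIL, and BLOCKED rules in this skill can be sketched as a function over row statuses. This Python sketch is not part of the published package and makes several assumptions the skill text does not state: the `material` flag on each row, FAIL taking precedence over BLOCKED, and an overall `Pending` result while any material row is still pending.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Row:
    id: str
    status: str            # one of: Pending, PASS, FAIL, BLOCKED, Waived
    material: bool = True  # hypothetical flag marking material rows


def matrix_verdict(rows: List[Row]) -> str:
    """Map row statuses to an overall verdict under the assumptions stated above."""
    material = [r.status for r in rows if r.material]
    if "FAIL" in material:
        return "FAIL"      # at least one material row has unmet evidence
    if "BLOCKED" in material:
        return "BLOCKED"   # evidence cannot be obtained or sources conflict
    if all(status in ("PASS", "Waived") for status in material):
        # Note: a matrix with no material rows also lands here; the sketch
        # does not treat that case specially.
        return "PASS"
    return "Pending"       # assumption: unresolved rows keep the matrix open


rows = [
    Row("A1", "PASS"),
    Row("A2", "Waived"),
    Row("A3", "FAIL", material=False),  # a non-material row does not decide the verdict
]
```

The sketch only covers the status mapping. The separate rule that material requirements hidden in residual risks force FAIL or BLOCKED would need a check over the report text, which is not modeled here.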
|
|
29
53
|
|
|
@@ -46,6 +70,9 @@ Consider:
|
|
|
46
70
|
- missing citation or unsupported claim
|
|
47
71
|
- document structure regression
|
|
48
72
|
- stakeholder requirement omitted
|
|
73
|
+
- source requirement weakened or omitted
|
|
74
|
+
- roadmap exit criteria demoted to future work
|
|
75
|
+
- material requirement hidden in residual risks
|
|
49
76
|
- artifact cannot be reused by a fresh agent or human
|
|
50
77
|
|
|
51
78
|
## Verification Report Shape
|