workflow-supervisor 0.1.1 → 0.1.3
- package/README.md +397 -317
- package/bin/workflow-skills.mjs +3 -2
- package/docs/skill-reference.md +5 -5
- package/docs/troubleshooting.md +16 -0
- package/package.json +1 -1
- package/skills/acceptance-matrix/SKILL.md +29 -2
- package/skills/loop-policy/SKILL.md +5 -2
- package/skills/work-unit/SKILL.md +19 -0
- package/skills/workflow-docs/SKILL.md +2 -1
- package/skills/workflow-docs/references/goal-resume.md +48 -3
- package/skills/workflow-docs/references/templates.md +2 -0
- package/skills/workflow-docs/references/workflow-control.md +149 -0
- package/skills/workflow-supervisor/SKILL.md +174 -27
- package/skills/workflow-supervisor/agents/openai.yaml +1 -1
package/README.md
CHANGED
|
@@ -1,212 +1,151 @@
|
|
|
1
1
|
# Workflow Supervisor
|
|
2
2
|
|
|
3
|
-
Workflow Supervisor is a
|
|
3
|
+
Workflow Supervisor is a strict supervision skill pack for agent work that needs to stay organized, resumable, and evidence-backed.
|
|
4
4
|
|
|
5
|
-
It turns a
|
|
5
|
+
It is for moments when you do not want an agent to jump straight into implementation, lose the thread halfway through, verify its own work, or quietly skip important handoffs. You ask for the supervisor, the supervisor asks the right setup questions, turns the work into small units, creates worker dossiers, delegates scoped work to real worker agents when possible, verifies the results, and leaves a clear outcome trail.
|
|
6
|
+
|
|
7
|
+
Example prompt:
|
|
6
8
|
|
|
7
9
|
```text
|
|
8
|
-
Use workflow-supervisor to
|
|
10
|
+
Use $workflow-supervisor to build a FastAPI Naive RAG demo for a healthcare specialist agent.
|
|
9
11
|
```
|
|
10
12
|
|
|
11
|
-
|
|
12
|
-
|
|
13
|
-
It currently supports certified automated worker delegation for **Codex** and **Claude Code**.
|
|
13
|
+
The correct first response is not code; it is an intake packet. That is intentional.
|
|
14
14
|
|
|
15
15
|

|
|
16
16
|
|
|
17
|
-
## What
|
|
18
|
-
|
|
19
|
-
Workflow Supervisor is not another agent product. It is a thin coordination layer for agents that already exist.
|
|
20
|
-
|
|
21
|
-
The supervisor is the visible agent in the conversation. It owns the plan, asks the user questions, creates work units, validates worker contracts, launches workers, reads reports, and decides what happens next.
|
|
22
|
-
|
|
23
|
-
Workers are short-lived CLI runs:
|
|
24
|
-
|
|
25
|
-
```bash
|
|
26
|
-
workflow-supervisor delegate --agent codex --role implementer --unit U1 --dossier .workflow/dossiers/U1.yaml
|
|
27
|
-
workflow-supervisor delegate --agent claude-code --role verifier --unit U1 --dossier .workflow/dossiers/U1.yaml
|
|
28
|
-
```
|
|
29
|
-
|
|
30
|
-
Each worker gets only the context it needs. It returns one structured report. The supervisor remains the coordinator.
|
|
31
|
-
|
|
32
|
-
## The Moat
|
|
17
|
+
## What You Get
|
|
33
18
|
|
|
34
|
-
|
|
19
|
+
Workflow Supervisor gives you a repeatable workflow for serious agent tasks:
|
|
35
20
|
|
|
36
|
-
|
|
21
|
+
- a complete intake before work starts
|
|
22
|
+
- a source map, even when the only source is the user prompt
|
|
23
|
+
- a source-requirement coverage ledger so roadmap items and exit criteria cannot disappear
|
|
24
|
+
- a `SPEC.md` review gate where humans can ask questions, request revisions, block, defer, or approve before work units are finalized
|
|
25
|
+
- bounded work units, including `WU-001` for tiny tasks
|
|
26
|
+
- dossiers that tell each worker exactly what to do and what not to touch
|
|
27
|
+
- separate implementer, verifier, repair, and documenter responsibilities
|
|
28
|
+
- structured worker reports instead of loose prose
|
|
29
|
+
- evidence-based verification
|
|
30
|
+
- repair loops that stay tied to failed acceptance rows
|
|
31
|
+
- durable `.workflow/` state when the work needs to survive context loss
|
|
32
|
+
- a final report with checks, risks, workers, and next actions
|
|
37
33
|
|
|
38
|
-
|
|
39
|
-
- no keyword-based skipping
|
|
40
|
-
- bounded work units before implementation
|
|
41
|
-
- machine-checkable `DossierV1` before workers start
|
|
42
|
-
- role separation between implementer, verifier, repair, and documenter
|
|
43
|
-
- normalized `WorkerReportV1` output from every worker
|
|
44
|
-
- evidence required for PASS
|
|
45
|
-
- automatic BLOCKED reports for vague dossiers, missing CLIs, auth failures, invalid output, timeouts, forbidden edits, or verifier mutations
|
|
34
|
+
The main design choice is simple: the supervisor coordinates, workers do scoped work, and the CLI stays small.
|
|
46
35
|
|
|
47
|
-
|
|
36
|
+
## The Mental Model
|
|
48
37
|
|
|
49
|
-
|
|
38
|
+
Think of Workflow Supervisor as a project lead inside the current agent conversation.
|
|
50
39
|
|
|
51
|
-
|
|
40
|
+
The supervisor:
|
|
52
41
|
|
|
53
|
-
- the
|
|
54
|
-
-
|
|
55
|
-
- the
|
|
56
|
-
-
|
|
57
|
-
-
|
|
58
|
-
-
|
|
59
|
-
-
|
|
60
|
-
-
|
|
61
|
-
- every platform has a different output style
|
|
42
|
+
- asks the user for missing decisions
|
|
43
|
+
- decides when work is ready to start
|
|
44
|
+
- creates the plan, units, dossiers, and acceptance rows
|
|
45
|
+
- hands work to role-specific workers
|
|
46
|
+
- reads worker reports
|
|
47
|
+
- routes blockers and repairs
|
|
48
|
+
- decides when verification is good enough
|
|
49
|
+
- applies the final disposition policy
|
|
62
50
|
|
|
63
|
-
|
|
51
|
+
Workers:
|
|
64
52
|
|
|
65
|
-
|
|
53
|
+
- receive one scoped dossier
|
|
54
|
+
- perform one role
|
|
55
|
+
- return one structured report
|
|
56
|
+
- do not talk to the human directly
|
|
57
|
+
- do not choose final disposition
|
|
58
|
+
- do not message each other
|
|
66
59
|
|
|
67
|
-
|
|
60
|
+
The CLI:
|
|
68
61
|
|
|
69
|
-
|
|
62
|
+
- installs the skills
|
|
63
|
+
- validates skill and schema files
|
|
64
|
+
- validates `DossierV1`
|
|
65
|
+
- invokes one worker process
|
|
66
|
+
- validates `WorkerReportV1`
|
|
67
|
+
- returns a normalized report to the supervisor
|
|
70
68
|
|
|
71
|
-
|
|
69
|
+
It is not a daemon, queue, dashboard, scheduler, or full agent harness.
|
|
72
70
|
|
|
73
71
|
```mermaid
|
|
74
72
|
flowchart TB
|
|
75
|
-
User["User"] --> Supervisor["Supervisor agent in current chat"]
|
|
76
|
-
Supervisor -->
|
|
77
|
-
Supervisor -->
|
|
78
|
-
Supervisor -->
|
|
79
|
-
|
|
80
|
-
|
|
81
|
-
|
|
82
|
-
|
|
83
|
-
|
|
84
|
-
|
|
85
|
-
CLI
|
|
86
|
-
end
|
|
87
|
-
|
|
88
|
-
CLI --> SkillsSource
|
|
89
|
-
CLI --> AdapterDefs
|
|
90
|
-
CLI --> SchemaDefs
|
|
91
|
-
CLI --> Docs
|
|
92
|
-
CLI --> Adapters["Adapter command array"]
|
|
93
|
-
Adapters --> Codex["Codex CLI worker"]
|
|
94
|
-
Adapters --> Claude["Claude Code CLI worker"]
|
|
95
|
-
Codex --> Report["WorkerReportV1 JSON"]
|
|
73
|
+
User["User"] --> Supervisor["Supervisor agent in the current chat"]
|
|
74
|
+
Supervisor --> Intake["Complete intake"]
|
|
75
|
+
Supervisor --> Sources["Source corpus"]
|
|
76
|
+
Supervisor --> Units["Work units"]
|
|
77
|
+
Supervisor --> Matrix["Acceptance matrix"]
|
|
78
|
+
Supervisor --> Dossiers["DossierV1 files"]
|
|
79
|
+
Supervisor --> CLI["workflow-supervisor CLI"]
|
|
80
|
+
CLI --> Codex["Codex worker"]
|
|
81
|
+
CLI --> Claude["Claude Code worker"]
|
|
82
|
+
Codex --> Report["WorkerReportV1"]
|
|
96
83
|
Claude --> Report
|
|
97
84
|
Report --> Supervisor
|
|
85
|
+
Supervisor --> Docs[".workflow/ state"]
|
|
86
|
+
Supervisor --> Outcome["Final report"]
|
|
98
87
|
```
|
|
99
88
|
|
|
100
|
-
|
|
89
|
+
## What Happens When You Invoke It
|
|
101
90
|
|
|
102
|
-
|
|
103
|
-
- `bin/workflow-skills.mjs` contains the installer, validator, context emitter, delegation wrapper, surface guard, and command dispatch.
|
|
104
|
-
- `schemas/` defines `DossierV1` and `WorkerReportV1`.
|
|
105
|
-
- `adapters/` defines certified Codex and Claude Code command arrays.
|
|
106
|
-
- `docs/` explains CLI usage, portable delegation semantics, compatibility, artifacts, and troubleshooting.
|
|
107
|
-
- `.workflow/` is created in consuming projects as private supervisor working memory, not as package state.
|
|
91
|
+
When you explicitly invoke `workflow-supervisor`, `$workflow-supervisor`, or say to use the skill, the workflow enters `strict_full_workflow`.
|
|
108
92
|
|
|
109
|
-
|
|
110
|
-
flowchart LR
|
|
111
|
-
Package["workflow-supervisor package"] --> Install["install"]
|
|
112
|
-
Package --> Emit["emit-context"]
|
|
113
|
-
Package --> Validate["validate and validate-dossier"]
|
|
114
|
-
Package --> Delegate["delegate and delegate-doctor"]
|
|
115
|
-
|
|
116
|
-
Install --> CodexTarget["Codex target: ~/.agents/skills or project .agents/skills"]
|
|
117
|
-
Install --> ClaudeTarget["Claude target: ~/.claude/skills or project .claude/skills"]
|
|
118
|
-
Install --> Gitignore["Project .gitignore contains .workflow/"]
|
|
119
|
-
|
|
120
|
-
Emit --> PortableFile["Portable context file: AGENTS.md or CLAUDE.md"]
|
|
121
|
-
Validate --> SkillGate["Skill, schema, adapter, and dossier gates"]
|
|
122
|
-
Delegate --> WorkerRun["One role-scoped worker CLI process"]
|
|
123
|
-
```
|
|
93
|
+
Strict mode means task size does not matter. Even if the request is "make a function that adds two numbers", explicit supervisor invocation still means the full workflow:
|
|
124
94
|
|
|
125
|
-
|
|
95
|
+
1. Present the complete intake packet.
|
|
96
|
+
2. Build or record the source corpus.
|
|
97
|
+
3. Create a source-requirement coverage ledger.
|
|
98
|
+
4. Create a `SPEC.md` review packet or file.
|
|
99
|
+
5. Pause for human Q&A, revisions, block, defer, or approval when the path is human-in-loop.
|
|
100
|
+
6. Create at least one work unit.
|
|
101
|
+
7. Create acceptance rows that preserve source-scope fidelity.
|
|
102
|
+
8. Create dossiers for the planned workers.
|
|
103
|
+
9. Create a worker-agent plan.
|
|
104
|
+
10. Ask for approval when the selected path is human-in-loop.
|
|
105
|
+
11. Delegate scoped work to real workers when the environment supports it.
|
|
106
|
+
12. Verify with evidence.
|
|
107
|
+
13. Route repair work if verification fails.
|
|
108
|
+
14. Refresh docs or outcome state.
|
|
109
|
+
15. Report final status and next action.
|
|
126
110
|
|
|
127
|
-
|
|
128
|
-
sequenceDiagram
|
|
129
|
-
participant S as Supervisor
|
|
130
|
-
participant C as "workflow-supervisor delegate"
|
|
131
|
-
participant D as "DossierV1 validator"
|
|
132
|
-
participant G as "Surface guard"
|
|
133
|
-
participant A as "Agent adapter"
|
|
134
|
-
participant W as "Worker CLI"
|
|
135
|
-
|
|
136
|
-
S->>C: Role, unit ID, workspace, dossier path
|
|
137
|
-
C->>D: Parse JSON, YAML, or fenced YAML
|
|
138
|
-
D-->>C: Valid dossier or BLOCKED invalid_dossier
|
|
139
|
-
C->>G: Snapshot git status or explicit surfaces
|
|
140
|
-
C->>A: Build command from adapters/<agent>/adapter.json
|
|
141
|
-
A->>W: Run one CLI process with role prompt and schema
|
|
142
|
-
W-->>A: stdout, stderr, exit code, timeout signal
|
|
143
|
-
A-->>C: Raw worker output
|
|
144
|
-
C->>C: Extract and validate WorkerReportV1
|
|
145
|
-
C->>G: Compare after-state against allowed and forbidden surfaces
|
|
146
|
-
C-->>S: PASS, FAIL, or normalized BLOCKED WorkerReportV1
|
|
147
|
-
```
|
|
111
|
+
This rule exists to prevent the agent from deciding that a task is "too simple" and quietly skipping the supervisor.
|
|
148
112
|
|
|
149
|
-
|
|
113
|
+
## Intake
|
|
150
114
|
|
|
151
|
-
|
|
152
|
-
stateDiagram-v2
|
|
153
|
-
[*] --> Intake
|
|
154
|
-
Intake --> SourceGrounding: Complete intake
|
|
155
|
-
SourceGrounding --> WorkUnits: Sources ranked
|
|
156
|
-
WorkUnits --> AcceptanceMatrix: Units bounded
|
|
157
|
-
AcceptanceMatrix --> Dossier: Evidence rows ready
|
|
158
|
-
Dossier --> Delegation: DossierV1 valid
|
|
159
|
-
Delegation --> Verification: WorkerReportV1 returned
|
|
160
|
-
Verification --> Repair: FAIL or actionable BLOCKED
|
|
161
|
-
Repair --> Verification: Repair report returned
|
|
162
|
-
Verification --> Documentation: PASS with evidence
|
|
163
|
-
Documentation --> FinalDisposition: Outcome recorded
|
|
164
|
-
FinalDisposition --> [*]
|
|
165
|
-
Dossier --> Intake: Missing decision
|
|
166
|
-
Delegation --> Intake: Worker BLOCKED with human question
|
|
167
|
-
```
|
|
115
|
+
The supervisor must get explicit answers to these seven items before planning deeply, creating a goal, delegating workers, implementing, publishing, or taking irreversible action:
|
|
168
116
|
|
|
169
|
-
|
|
170
|
-
|
|
171
|
-
|
|
172
|
-
|
|
173
|
-
|
|
174
|
-
|
|
175
|
-
|
|
176
|
-
|
|
177
|
-
|
|
178
|
-
- important enough to require independent verification
|
|
179
|
-
- easier to handle as several bounded units
|
|
180
|
-
|
|
181
|
-
Good examples:
|
|
117
|
+
```text
|
|
118
|
+
1. Objective and source: what artifact, spec, repo path, document, ticket, or source set controls the work?
|
|
119
|
+
2. Execution path: autonomous_goal or human_in_loop?
|
|
120
|
+
3. Mode: sequential, parallel where safe, or staged parallel?
|
|
121
|
+
4. Delegation: automated worker delegation, native threads/subagents if available, or same-session phased?
|
|
122
|
+
5. Final disposition: keep local, open PR, push main, deploy/publish, or ask at the end?
|
|
123
|
+
6. Boundaries: may I install dependencies, call external services, use credentials, or only edit local files?
|
|
124
|
+
7. State artifacts: create .workflow docs, use another artifact directory, or keep state inline?
|
|
125
|
+
```
|
|
182
126
|
|
|
183
|
-
|
|
184
|
-
- refactor authentication across several modules
|
|
185
|
-
- update docs from a new API spec
|
|
186
|
-
- implement a feature with tests and verification
|
|
187
|
-
- review and repair a messy PR
|
|
188
|
-
- produce durable workflow docs for a long-running task
|
|
127
|
+
If any answer is missing or vague, the supervisor asks only for the missing pieces and stops. Phrases like "work autonomously", "just do it", or "use your judgment" do not fill in the missing intake fields.
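The stop-and-ask rule above can be sketched in a few lines of Python. This is an illustration only: the field keys in `INTAKE_FIELDS` and the substring check for vague answers are assumptions, not names or logic taken from the package.

```python
# Hypothetical keys for the seven intake items; the package may use other names.
INTAKE_FIELDS = (
    "objective_and_source",
    "execution_path",
    "mode",
    "delegation",
    "final_disposition",
    "boundaries",
    "state_artifacts",
)

# Phrases the README says do not fill in a missing intake field.
NON_ANSWERS = ("work autonomously", "just do it", "use your judgment")


def missing_intake_fields(answers):
    """Return the intake fields that still need an explicit answer."""
    missing = []
    for field in INTAKE_FIELDS:
        value = str(answers.get(field) or "").strip().lower()
        # An empty answer or a hand-wave phrase both count as missing.
        if not value or any(phrase in value for phrase in NON_ANSWERS):
            missing.append(field)
    return missing


answers = {
    "objective_and_source": "Build the demo described in the user prompt",
    "execution_path": "human_in_loop",
    "mode": "just do it",
}
print(missing_intake_fields(answers))
```

In this sketch the supervisor would ask only for the returned fields and stop until they are answered.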
|
|
189
128
|
|
|
190
|
-
|
|
129
|
+
Expected human pauses are normal. A workflow can move from `WAITING_FOR_HUMAN` back to `ACTIVE` after the user approves a plan or answers a blocker question.
|
|
191
130
|
|
|
192
|
-
-
|
|
193
|
-
- one-off shell commands
|
|
194
|
-
- obvious single-file changes
|
|
195
|
-
- quick explanations
|
|
196
|
-
- tasks where a normal agent turn is enough
|
|
131
|
+
In `autonomous_goal`, a human clarification pause is not automatically a terminal failed goal. The supervisor records the blocker, asks the smallest needed question, updates SPEC/Q&A/coverage state when the answer arrives, refreshes only affected downstream artifacts, and resumes from the recorded next action. If an old Codex goal was already terminal-blocked, the resumed workflow references it as history and continues from workflow state or a newly authorized goal binding.
|
|
197
132
|
|
|
198
|
-
##
|
|
133
|
+
## The Workflow
|
|
199
134
|
|
|
200
|
-
The
|
|
135
|
+
The full loop looks like this:
|
|
201
136
|
|
|
202
137
|
```text
|
|
203
|
-
intake
|
|
204
|
-
-> source
|
|
138
|
+
complete intake
|
|
139
|
+
-> source corpus
|
|
140
|
+
-> source-requirement coverage ledger
|
|
141
|
+
-> SPEC review and Q&A gate
|
|
205
142
|
-> work units
|
|
143
|
+
-> loop policy
|
|
206
144
|
-> acceptance matrix
|
|
207
|
-
->
|
|
208
|
-
->
|
|
209
|
-
->
|
|
145
|
+
-> dossiers
|
|
146
|
+
-> approval or autonomous path gate
|
|
147
|
+
-> worker handoff
|
|
148
|
+
-> worker report
|
|
210
149
|
-> verification
|
|
211
150
|
-> repair if needed
|
|
212
151
|
-> re-verification
|
|
@@ -214,257 +153,398 @@ intake
|
|
|
214
153
|
-> final disposition
|
|
215
154
|
```
|
|
216
155
|
|
|
217
|
-
|
|
218
|
-
|
|
219
|
-
The supervisor must ask the user for every required decision before it plans deeply or starts work:
|
|
156
|
+
The worker lifecycle is tracked separately:
|
|
220
157
|
|
|
221
158
|
```text
|
|
222
|
-
|
|
223
|
-
2. Execution path: autonomous_goal or human_in_loop
|
|
224
|
-
3. Mode: sequential, parallel where safe, or staged parallel
|
|
225
|
-
4. Delegation: automated workers, native subagents if available, or same-session phased
|
|
226
|
-
5. Final disposition: keep local, open PR, push, deploy, publish, or ask at end
|
|
227
|
-
6. Boundaries: installs, network, credentials, destructive operations, forbidden surfaces
|
|
228
|
-
7. State artifacts: .workflow docs, another directory, or inline state
|
|
159
|
+
planned -> handed_off -> acknowledged -> reported -> verified -> closed
|
|
229
160
|
```
|
|
230
161
|
|
|
231
|
-
|
|
162
|
+
This makes it possible to see where the workflow is, which worker owns which piece, what evidence exists, and what should happen next.
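A minimal Python sketch of the worker lifecycle as a one-way sequence. The state names come from the line above; the strictly linear transition rule and the `advance` helper are assumptions made for illustration, and the real workflow may allow other transitions, for example during repair.

```python
LIFECYCLE = ["planned", "handed_off", "acknowledged", "reported", "verified", "closed"]


def advance(state):
    """Return the next lifecycle state without skipping any step."""
    index = LIFECYCLE.index(state)  # raises ValueError for an unknown state
    if index == len(LIFECYCLE) - 1:
        raise ValueError("worker is already closed")
    return LIFECYCLE[index + 1]


state = "planned"
while state != "closed":
    state = advance(state)
    print(state)
```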
|
|
163
|
+
|
|
164
|
+
For source-of-truth builds, the coverage ledger is the guardrail against "green but incomplete" outcomes. Every material source requirement must be mapped to a work unit and acceptance row, explicitly deferred by the user, blocked as a scope decision, or marked non-material with a reason. Residual risks and future-work notes cannot contain unimplemented material source requirements in a PASS workflow.
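The coverage rule can be sketched as a small check over ledger rows. The row keys (`id`, `status`, `work_unit`, `acceptance_row`, `reason`) and the status strings are hypothetical; the README does not define a ledger format.

```python
# Hypothetical status values for requirements that are resolved without work.
RESOLVED_WITHOUT_WORK = {"deferred_by_user", "blocked_scope_decision"}


def uncovered_requirements(ledger):
    """Return ids of requirements that would make a PASS green but incomplete."""
    uncovered = []
    for row in ledger:
        status = row.get("status")
        if status == "mapped":
            # A mapped requirement needs both a work unit and an acceptance row.
            ok = bool(row.get("work_unit")) and bool(row.get("acceptance_row"))
        elif status == "non_material":
            # Non-material requirements must carry a reason.
            ok = bool(row.get("reason"))
        else:
            ok = status in RESOLVED_WITHOUT_WORK
        if not ok:
            uncovered.append(row["id"])
    return uncovered


ledger = [
    {"id": "R1", "status": "mapped", "work_unit": "WU-001", "acceptance_row": "A1"},
    {"id": "R2", "status": "mapped", "work_unit": "WU-001"},
    {"id": "R3", "status": "deferred_by_user"},
    {"id": "R4", "status": "non_material"},
    {"id": "R5", "status": "future_work"},
]
print(uncovered_requirements(ledger))
```

A requirement parked as future work (`R5` here) is reported as uncovered, which matches the rule that residual notes cannot hide material requirements in a PASS workflow.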
|
|
165
|
+
|
|
166
|
+
`SPEC.md` is the human review contract before final work units. In human-in-loop mode, the supervisor stops at the draft SPEC so the human can ask questions, request revisions, mark items deferred, block the workflow, or approve. The workflow continues only after explicit approval.
|
|
232
167
|
|
|
233
|
-
|
|
168
|
+
When a workflow pauses for a human decision, the decision is recorded as state rather than treated as a restart. The next supervisor pass updates the affected coverage rows, SPEC fields, work units, acceptance rows, dossiers, or verification results, invalidates stale artifacts, and continues from the saved `Next Action`.
|
|
234
169
|
|
|
235
|
-
|
|
170
|
+
## Skills In The Pack
|
|
236
171
|
|
|
237
|
-
|
|
172
|
+
The skill pack is made of small focused skills. The supervisor can use them as phase instructions.
|
|
238
173
|
|
|
239
|
-
|
|
174
|
+
| Skill | What it does |
|
|
175
|
+
|---|---|
|
|
176
|
+
| `workflow-supervisor` | Coordinates the whole workflow, gates, workers, verification, repair, and final disposition. |
|
|
177
|
+
| `source-corpus` | Lists and ranks sources, gaps, contradictions, authority, freshness, and allowed next action. |
|
|
178
|
+
| `work-unit` | Turns the objective into bounded units with dependencies, surfaces, readiness, and done criteria. |
|
|
179
|
+
| `loop-policy` | Defines execution path, mode, approval gates, repair limits, budgets, goal policy, and resume behavior. |
|
|
180
|
+
| `acceptance-matrix` | Turns requirements into evidence rows with PASS, FAIL, BLOCKED, and waiver handling. |
|
|
181
|
+
| `dossier-builder` | Creates concrete `DossierV1` contracts for workers. |
|
|
182
|
+
| `worker-roles` | Defines role boundaries so implementers, verifiers, repair authors, and documenters do not blur together. |
|
|
183
|
+
| `workflow-docs` | Creates or refreshes durable `.workflow/` artifacts when state needs to persist. |
|
|
240
184
|
|
|
241
|
-
|
|
185
|
+
Loading a skill does not spawn a worker. A skill is instruction context for the current supervisor. A worker is a separate role-scoped execution run.
|
|
242
186
|
|
|
243
|
-
|
|
187
|
+
## Files The Workflow Creates
|
|
188
|
+
|
|
189
|
+
Workflow state lives under `.workflow/` by default. The directory is local supervisor memory, not product code.
|
|
190
|
+
|
|
191
|
+
In a Git-backed project, `.workflow/` must be in `.gitignore` before these files are written. Project installs do this automatically.
|
|
192
|
+
|
|
193
|
+
Common workflow files:
|
|
194
|
+
|
|
195
|
+
| File | Created from | Purpose |
|
|
196
|
+
|---|---|---|
|
|
197
|
+
| `.workflow/WORKFLOW.md` | `workflow-supervisor`, `loop-policy`, `workflow-docs` | Main state, objective, execution path, policy, stop gates, next action. |
|
|
198
|
+
| `.workflow/SOURCE-CORPUS.md` | `source-corpus`, `workflow-docs` | Source ranking, missing sources, contradictions, assumptions. |
|
|
199
|
+
| `.workflow/SPEC.md` | `workflow-supervisor`, `source-corpus`, `workflow-docs` | Human-reviewable interpretation, requirement coverage, Q&A, and approval decision before work units. |
|
|
200
|
+
| `.workflow/WORK-UNITS.md` | `work-unit`, `workflow-docs` | Unit list, dependencies, sequencing, blocked units. |
|
|
201
|
+
| `.workflow/DOSSIER.md` or `.workflow/dossiers/*.yaml` | `dossier-builder`, `workflow-docs` | Worker contracts for implementation, verification, repair, or documentation. |
|
|
202
|
+
| `.workflow/WORKER-MAP.md` | `workflow-supervisor`, `worker-roles`, `workflow-docs` | Worker names, roles, transports, lifecycle, reports, blockers. |
|
|
203
|
+
| `.workflow/ACCEPTANCE-MATRIX.md` | `acceptance-matrix`, `workflow-docs` | Evidence rows and material PASS, FAIL, BLOCKED states. |
|
|
204
|
+
| `.workflow/VERIFICATION-REPORT.md` | verifier worker, `acceptance-matrix`, `workflow-docs` | Verification evidence, findings, skipped checks, residual risks. |
|
|
205
|
+
| `.workflow/REPAIR-TICKETS.md` | repair worker, `workflow-docs` | Repair tasks tied to failed rows or verifier findings. |
|
|
206
|
+
| `.workflow/DECISIONS.md` | supervisor, `workflow-docs` | User decisions, assumptions, reversals, unresolved questions. |
|
|
207
|
+
| `.workflow/HANDOFF.md` | supervisor, `workflow-docs` | Resume pack for another agent or later session. |
|
|
208
|
+
| `.workflow/OUTCOME.md` | supervisor, documenter worker, `workflow-docs` | Final status, checks, risks, disposition, next action. |
|
|
209
|
+
| `.workflow/GOAL-STATE.md` | supervisor, `workflow-docs` | Codex goal mirror, blocked-goal history, human-decision resume checkpoint, and durable backup. |
|
|
210
|
+
|
|
211
|
+
For documentation-heavy workflows, `workflow-docs` can also create:
|
|
244
212
|
|
|
245
213
|
```text
|
|
246
|
-
|
|
247
|
-
|
|
248
|
-
|
|
249
|
-
|
|
250
|
-
|
|
214
|
+
.workflow/DOCUMENTATION-BRIEF.md
|
|
215
|
+
.workflow/CONTENT-INVENTORY.md
|
|
216
|
+
.workflow/OUTLINE.md
|
|
217
|
+
.workflow/CONTENT-DRAFT.md
|
|
218
|
+
.workflow/CLAIMS-REGISTER.md
|
|
219
|
+
.workflow/STYLE-GUIDE.md
|
|
220
|
+
.workflow/GLOSSARY.md
|
|
221
|
+
.workflow/ASSET-REGISTER.md
|
|
222
|
+
.workflow/REVIEW-PLAN.md
|
|
223
|
+
.workflow/REVISION-QUEUE.md
|
|
224
|
+
.workflow/PUBLISHING-CHECKLIST.md
|
|
225
|
+
.workflow/PUBLICATION-LOG.md
|
|
226
|
+
.workflow/MAINTENANCE-PLAN.md
|
|
251
227
|
```
|
|
252
228
|
|
|
253
|
-
|
|
229
|
+
It should not create all of these by default. It should create the smallest useful set.
|
|
230
|
+
|
|
231
|
+
## Dossiers
|
|
254
232
|
|
|
255
|
-
|
|
233
|
+
A dossier is the worker contract. It is how the supervisor prevents vague delegation.
|
|
256
234
|
|
|
257
|
-
|
|
235
|
+
Before any worker starts, the supervisor creates a concrete `DossierV1` with:
|
|
258
236
|
|
|
259
|
-
-
|
|
260
|
-
-
|
|
237
|
+
- workflow name
|
|
238
|
+
- work unit
|
|
239
|
+
- dossier id
|
|
240
|
+
- worker name
|
|
241
|
+
- worker role
|
|
242
|
+
- delegation transport
|
|
243
|
+
- start condition
|
|
244
|
+
- objective and non-goals
|
|
245
|
+
- source corpus and must-read sources
|
|
261
246
|
- allowed surfaces
|
|
262
247
|
- forbidden surfaces
|
|
263
|
-
- must-read sources
|
|
264
248
|
- acceptance rows
|
|
265
|
-
- required evidence
|
|
266
249
|
- adversarial checks
|
|
250
|
+
- required commands or evidence
|
|
251
|
+
- worker prompt
|
|
252
|
+
- supervisor checkpoints
|
|
253
|
+
- report schemas
|
|
267
254
|
- stop gates
|
|
268
|
-
-
|
|
255
|
+
- assumptions
|
|
256
|
+
- open questions
|
|
269
257
|
|
|
270
|
-
|
|
258
|
+
Validate a dossier before delegation:
|
|
271
259
|
|
|
272
260
|
```bash
|
|
273
|
-
workflow-supervisor validate-dossier .workflow/dossiers/
|
|
261
|
+
workflow-supervisor validate-dossier .workflow/dossiers/WU-001-implementer.yaml --role implementer --unit WU-001 --json
|
|
274
262
|
```
|
|
275
263
|
|
|
276
|
-
|
|
264
|
+
The validator rejects things like `TBD`, `unknown`, `all files`, `entire repo`, unresolved open questions, role mismatches, unit mismatches, missing forbidden surfaces, and prompts that do not require `WorkerReportV1`.
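To make the rejection rules concrete, here is a rough Python sketch of the same kinds of checks. It is not the package's validator: the dossier keys (`worker_role`, `work_unit`, `forbidden_surfaces`, `open_questions`) are assumed names, and the simple phrase match would also flag legitimate uses of a word like "unknown".

```python
import re

# Placeholder phrases the README lists as rejected.
VAGUE = re.compile(r"\b(TBD|unknown|all files|entire repo)\b", re.IGNORECASE)


def dossier_problems(dossier, role, unit):
    """Return human-readable problems found in a dossier-like dict."""
    problems = []
    for key, value in dossier.items():
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, str) and VAGUE.search(item):
                problems.append(f"{key}: vague value {item!r}")
    if dossier.get("worker_role") != role:
        problems.append("role mismatch")
    if dossier.get("work_unit") != unit:
        problems.append("unit mismatch")
    if not dossier.get("forbidden_surfaces"):
        problems.append("missing forbidden surfaces")
    if dossier.get("open_questions"):
        problems.append("unresolved open questions")
    return problems


bad = {
    "worker_role": "verifier",
    "work_unit": "WU-001",
    "allowed_surfaces": ["entire repo"],
    "open_questions": ["Which database?"],
}
for problem in dossier_problems(bad, "implementer", "WU-001"):
    print(problem)
```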
|
|
277
265
|
|
|
278
|
-
|
|
266
|
+
## Workers
|
|
279
267
|
|
|
280
|
-
The
|
|
268
|
+
The required worker responsibilities are:
|
|
281
269
|
|
|
282
|
-
|
|
283
|
-
|
|
284
|
-
|
|
270
|
+
| Responsibility | CLI role value | What it does |
|
|
271
|
+
|---|---|---|
|
|
272
|
+
| Implementer | `implementer` | Changes only the allowed surfaces named in the dossier. |
|
|
273
|
+
| Verifier | `verifier` | Checks the work against acceptance rows and must not edit implementation. |
|
|
274
|
+
| Repair author | `repair` | Converts failed rows or verifier findings into actionable repair work. |
|
|
275
|
+
| Documenter | `documenter` | Updates workflow or outcome docs from evidence. |
|
|
276
|
+
|
|
277
|
+
The skill text may say "repair-author" because that is the human role. The CLI schema uses `repair`.
|
|
278
|
+
|
|
279
|
+
Workers receive only their scoped handoff:
|
|
280
|
+
|
|
281
|
+
- role
|
|
282
|
+
- dossier
|
|
283
|
+
- sources
|
|
284
|
+
- acceptance rows
|
|
285
|
+
- stop gates
|
|
286
|
+
- report schema
|
|
287
|
+
|
|
288
|
+
They return one terminal `WorkerReportV1`.
|
|
289
|
+
|
|
290
|
+
## Worker Reports
|
|
285
291
|
|
|
286
|
-
|
|
292
|
+
Every delegated worker returns this machine-shaped report:
|
|
293
|
+
|
|
294
|
+
```json
|
|
295
|
+
{
|
|
296
|
+
"schema": "WorkerReportV1",
|
|
297
|
+
"status": "PASS",
|
|
298
|
+
"role": "verifier",
|
|
299
|
+
"unit_id": "WU-001",
|
|
300
|
+
"summary": "Verified the API responses and retrieval behavior against the acceptance rows.",
|
|
301
|
+
"changed_surfaces": [],
|
|
302
|
+
"evidence": ["pytest tests/test_api.py passed", "manual inspection of /health response"],
|
|
303
|
+
"checks_run": ["pytest tests/test_api.py"],
|
|
304
|
+
"skipped_checks": [],
|
|
305
|
+
"findings": [],
|
|
306
|
+
"blocking_question": null,
|
|
307
|
+
"next_action": "supervisor_review",
|
|
308
|
+
"adapter": null,
|
|
309
|
+
"guard": null,
|
|
310
|
+
"reason": null
|
|
311
|
+
}
|
|
312
|
+
```
|
|
287
313
|
|
|
288
|
-
|
|
314
|
+
The supervisor trusts the report shape, not loose prose. A PASS without evidence is invalid. A verifier that edits implementation is invalid. A worker that asks the human directly is converted into a blocker for the supervisor to route.
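The two invalid cases named above can be checked mechanically. The sketch below is a simplified stand-in, not the package's schema validation; treating any non-empty `changed_surfaces` on a verifier report as an implementation edit is an assumption.

```python
VALID_STATUSES = {"PASS", "FAIL", "BLOCKED"}


def report_violations(report):
    """Return rule violations for a WorkerReportV1-shaped dict."""
    violations = []
    if report.get("schema") != "WorkerReportV1":
        violations.append("wrong schema")
    if report.get("status") not in VALID_STATUSES:
        violations.append("unknown status")
    if report.get("status") == "PASS" and not report.get("evidence"):
        violations.append("PASS without evidence")
    # Assumption: a verifier should report no changed surfaces at all.
    if report.get("role") == "verifier" and report.get("changed_surfaces"):
        violations.append("verifier changed surfaces")
    return violations


report = {
    "schema": "WorkerReportV1",
    "status": "PASS",
    "role": "verifier",
    "unit_id": "WU-001",
    "changed_surfaces": [],
    "evidence": ["pytest tests/test_api.py passed"],
}
print(report_violations(report))
print(report_violations({**report, "evidence": []}))
```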
|
|
289
315
|
|
|
290
|
-
|
|
316
|
+
## How The Supervisor Talks To Workers
|
|
291
317
|
|
|
292
|
-
|
|
318
|
+
The portable worker path is one CLI command:
|
|
293
319
|
|
|
294
|
-
|
|
320
|
+
```bash
|
|
321
|
+
workflow-supervisor delegate \
|
|
322
|
+
--agent <codex|claude-code> \
|
|
323
|
+
--role <implementer|verifier|repair|documenter> \
|
|
324
|
+
--unit <unit-id> \
|
|
325
|
+
--cwd <workspace> \
|
|
326
|
+
--dossier <path>
|
|
327
|
+
```
|
|
295
328
|
|
|
296
|
-
The
|
|
329
|
+
The command:
|
|
297
330
|
|
|
298
|
-
|
|
299
|
-
|
|
300
|
-
|
|
301
|
-
|
|
302
|
-
|
|
303
|
-
|
|
331
|
+
1. Validates the dossier as `DossierV1`.
|
|
332
|
+
2. Builds a scoped worker prompt.
|
|
333
|
+
3. Starts the selected agent CLI with an adapter command array.
|
|
334
|
+
4. Captures stdout, stderr, exit code, and timeout.
|
|
335
|
+
5. Extracts and validates `WorkerReportV1`.
|
|
336
|
+
6. Runs surface and role guards.
|
|
337
|
+
7. Prints one normalized JSON report for the supervisor.
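A supervisor-side script could wrap the command roughly as follows. The flags and allowed values are the ones shown in the README; reading the report as JSON from stdout and the 600-second timeout are assumptions, and `run_delegate` is defined here but not called because it needs the CLI installed.

```python
import json
import subprocess

AGENTS = {"codex", "claude-code"}
ROLES = {"implementer", "verifier", "repair", "documenter"}


def build_delegate_command(agent, role, unit, cwd, dossier):
    """Build the argument list for one delegate call."""
    if agent not in AGENTS:
        raise ValueError(f"not a certified worker adapter: {agent}")
    if role not in ROLES:
        raise ValueError(f"unknown role: {role}")
    return [
        "workflow-supervisor", "delegate",
        "--agent", agent,
        "--role", role,
        "--unit", unit,
        "--cwd", cwd,
        "--dossier", dossier,
    ]


def run_delegate(command, timeout_seconds=600):
    """Run one worker and parse the report (assumes JSON on stdout)."""
    completed = subprocess.run(
        command, capture_output=True, text=True, timeout=timeout_seconds
    )
    return json.loads(completed.stdout)


command = build_delegate_command(
    "codex", "verifier", "WU-001", ".", ".workflow/dossiers/WU-001-verifier.yaml"
)
print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting issues with dossier paths.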
|
|
304
338
|
|
|
305
|
-
|
|
339
|
+
Certified worker adapters:
|
|
306
340
|
|
|
307
|
-
|
|
341
|
+
- `codex`
|
|
342
|
+
- `claude-code`
|
|
308
343
|
|
|
309
|
-
The
|
|
344
|
+
The `generic` target is for Markdown instruction export. It is not a certified automated worker adapter.
|
|
310
345
|
|
|
311
|
-
|
|
346
|
+
Check local adapter readiness:
|
|
312
347
|
|
|
313
|
-
```
|
|
314
|
-
|
|
315
|
-
approval packet
|
|
316
|
-
progress summaries
|
|
317
|
-
blocker questions if needed
|
|
318
|
-
final report
|
|
348
|
+
```bash
|
|
349
|
+
workflow-supervisor delegate-doctor --agent all --probe --require-pass
|
|
319
350
|
```
|
|
320
351
|
|
|
321
|
-
|
|
352
|
+
If a worker adapter is missing, unauthenticated, times out, returns invalid output, edits forbidden surfaces, or returns PASS without evidence, the delegate command returns a structured `BLOCKED` report.
|
|
353
|
+
|
|
354
|
+
## No Silent Fallbacks
|
|
355
|
+
|
|
356
|
+
If the environment can create, message, or delegate to worker agents, the supervisor must use real workers for implementation, verification, repair, and documentation responsibilities.
|
|
357
|
+
|
|
358
|
+
If it cannot, it must record:
|
|
322
359
|
|
|
323
360
|
```text
|
|
324
|
-
|
|
325
|
-
execution plan
|
|
326
|
-
periodic progress summaries
|
|
327
|
-
blockers only when needed
|
|
328
|
-
final report
|
|
361
|
+
worker_agent_unavailable
|
|
329
362
|
```
|
|
330
363
|
|
|
331
|
-
|
|
364
|
+
Then it must stop for a human decision unless complete intake explicitly selected `same_session_phased`.
|
|
332
365
|
|
|
333
|
-
|
|
366
|
+
Same-session phased work is allowed only when selected. Verification in that mode is a `self-check`, not an `independent-verifier`.
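The fallback rule reads as a small decision table. The Python sketch below is one interpretation: the dictionary keys are hypothetical, and honoring `same_session_phased` when real workers are also available is an assumption the README does not spell out.

```python
def plan_delegation(workers_available, selected_delegation):
    """Decide how work proceeds, following the no-silent-fallback rule."""
    if selected_delegation == "same_session_phased":
        # Allowed only because intake explicitly selected it.
        return {
            "transport": "same_session_phased",
            "verification": "self-check",
            "record": None if workers_available else "worker_agent_unavailable",
            "stop_for_human": False,
        }
    if workers_available:
        return {
            "transport": "real_workers",
            "verification": "independent-verifier",
            "record": None,
            "stop_for_human": False,
        }
    # No workers and no explicit same-session choice: record and stop.
    return {
        "transport": None,
        "verification": None,
        "record": "worker_agent_unavailable",
        "stop_for_human": True,
    }


print(plan_delegation(False, "automated_worker_delegation"))
```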
|
|
334
367
|
|
|
335
|
-
|
|
368
|
+
## Install
|
|
336
369
|
|
|
337
|
-
|
|
370
|
+
Install from npm once published:
|
|
371
|
+
|
|
372
|
+
```bash
|
|
373
|
+
npm install -g workflow-supervisor
|
|
374
|
+
workflow-supervisor validate
|
|
375
|
+
```
|
|
338
376
|
|
|
339
|
-
|
|
377
|
+
Use with `npx`:
|
|
340
378
|
|
|
341
|
-
```
|
|
342
|
-
|
|
343
|
-
SOURCE-CORPUS.md
|
|
344
|
-
WORK-UNITS.md
|
|
345
|
-
DOSSIER.md
|
|
346
|
-
WORKER-MAP.md
|
|
347
|
-
ACCEPTANCE-MATRIX.md
|
|
348
|
-
VERIFICATION-REPORT.md
|
|
349
|
-
REPAIR-TICKETS.md
|
|
350
|
-
DECISIONS.md
|
|
351
|
-
HANDOFF.md
|
|
352
|
-
OUTCOME.md
|
|
353
|
-
GOAL-STATE.md
|
|
379
|
+
```bash
|
|
380
|
+
npx workflow-supervisor list
|
|
354
381
|
```
|
|
355
382
|
|
|
356
|
-
|
|
383
|
+
Install skills for Codex:
|
|
357
384
|
|
|
358
|
-
|
|
385
|
+
```bash
|
|
386
|
+
npx workflow-supervisor install --agent codex --scope user
|
|
387
|
+
```
|
|
359
388
|
|
|
360
|
-
|
|
389
|
+
Install skills for Claude Code:
|
|
361
390
|
|
|
362
|
-
```
|
|
363
|
-
|
|
364
|
-
"schema": "WorkerReportV1",
|
|
365
|
-
"status": "PASS",
|
|
366
|
-
"role": "verifier",
|
|
367
|
-
"unit_id": "U2",
|
|
368
|
-
"summary": "Verified LanceDB-backed search path.",
|
|
369
|
-
"changed_surfaces": [],
|
|
370
|
-
"evidence": ["pytest tests/test_search.py passed"],
|
|
371
|
-
"checks_run": ["pytest tests/test_search.py"],
|
|
372
|
-
"skipped_checks": [],
|
|
373
|
-
"findings": [],
|
|
374
|
-
"blocking_question": null,
|
|
375
|
-
"next_action": "supervisor_review",
|
|
376
|
-
"adapter": null,
|
|
377
|
-
"guard": null,
|
|
378
|
-
"reason": null
|
|
379
|
-
}
|
|
391
|
+
```bash
|
|
392
|
+
npx workflow-supervisor install --agent claude-code --scope user
|
|
380
393
|
```
|
|
381
394
|
|
|
382
|
-
|
|
395
|
+
Install both certified targets into a project:
|
|
383
396
|
|
|
384
|
-
|
|
397
|
+
```bash
|
|
398
|
+
npx workflow-supervisor install --agent all --scope project --project .
|
|
399
|
+
```
|
|
385
400
|
|
|
386
|
-
-
|
|
387
|
-
- goal status
|
|
388
|
-
- sources used
|
|
389
|
-
- work units completed
|
|
390
|
-
- workers delegated
|
|
391
|
-
- checks run
|
|
392
|
-
- skipped checks
|
|
393
|
-
- repairs performed
|
|
394
|
-
- residual risks
|
|
395
|
-
- final disposition
|
|
396
|
-
- next action
|
|
401
|
+
Project installs copy the skill folders into project-level agent directories and ensure the target project `.gitignore` contains:
|
|
397
402
|
|
|
398
|
-
|
|
403
|
+
```gitignore
|
|
404
|
+
.workflow/
|
|
405
|
+
```
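The `.gitignore` step described above is naturally idempotent: the entry should be added once and never duplicated on repeat installs. The following is a minimal sketch of that behavior, not the package's installer code; the function name `ensure_ignored` is hypothetical and it works on the file's text rather than on disk.

```python
def ensure_ignored(gitignore_text: str, entry: str = ".workflow/") -> str:
    """Return gitignore_text with `entry` present as its own line.

    Hypothetical helper for illustration only; the real installer may differ.
    """
    lines = gitignore_text.splitlines()
    # Compare stripped lines so trailing whitespace does not cause a duplicate.
    if any(line.strip() == entry for line in lines):
        return gitignore_text
    # Make sure existing content ends with a newline before appending.
    prefix = gitignore_text
    if prefix and not prefix.endswith("\n"):
        prefix += "\n"
    return prefix + entry + "\n"
```

Calling the function twice on the same text returns the same result as calling it once, which is the property a repeatable project install needs.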
|
|
399
406
|
|
|
400
407
|
From a local checkout:
|
|
401
408
|
|
|
402
409
|
```bash
|
|
403
410
|
git clone https://github.com/NikolaCehic/workflow-supervisor.git
|
|
404
411
|
cd workflow-supervisor
|
|
412
|
+
npm install
|
|
405
413
|
npm run validate
|
|
406
414
|
```
|
|
407
415
|
|
|
408
|
-
|
|
416
|
+
## Basic Use
|
|
409
417
|
|
|
410
|
-
|
|
411
|
-
|
|
418
|
+
After installing the skills, ask your agent:
|
|
419
|
+
|
|
420
|
+
```text
|
|
421
|
+
Use $workflow-supervisor to implement a healthcare specialist FastAPI Naive RAG demo.
|
|
412
422
|
```
|
|
413
423
|
|
|
414
|
-
|
|
424
|
+
You should expect:
|
|
425
|
+
|
|
426
|
+
1. The supervisor asks every question in the complete intake packet.
|
|
427
|
+
2. You answer every intake item.
|
|
428
|
+
3. If the path is `human_in_loop`, the supervisor gives you an approval packet before implementation.
|
|
429
|
+
4. The supervisor creates the source-requirement coverage ledger and `SPEC.md`.
|
|
430
|
+
5. You ask questions, request revisions, block, defer, or approve the SPEC.
|
|
431
|
+
6. After approval, the supervisor creates work units, acceptance rows, and dossiers.
|
|
432
|
+
7. The supervisor delegates scoped work to workers when supported.
|
|
433
|
+
8. Workers return structured reports.
|
|
434
|
+
9. The supervisor verifies, routes repairs if needed, and gives you the final result.
|
|
435
|
+
|
|
436
|
+
If you only want a normal quick edit, do not invoke `$workflow-supervisor`.
|
|
437
|
+
|
|
438
|
+
## CLI Reference
|
|
439
|
+
|
|
440
|
+
Common commands:
|
|
415
441
|
|
|
416
442
|
```bash
|
|
417
|
-
|
|
443
|
+
workflow-supervisor list
|
|
444
|
+
workflow-supervisor validate
|
|
445
|
+
workflow-supervisor doctor --agent all
|
|
446
|
+
workflow-supervisor install --agent codex --scope user
|
|
447
|
+
workflow-supervisor install --agent claude-code --scope user
|
|
448
|
+
workflow-supervisor install --agent all --scope project --project .
|
|
449
|
+
workflow-supervisor emit-context --agent generic --out AGENTS.md
|
|
450
|
+
workflow-supervisor validate-dossier .workflow/dossiers/WU-001-implementer.yaml --role implementer --unit WU-001 --json
|
|
451
|
+
workflow-supervisor delegate --agent codex --role implementer --unit WU-001 --cwd . --dossier .workflow/dossiers/WU-001-implementer.yaml
|
|
452
|
+
workflow-supervisor delegate --agent claude-code --role verifier --unit WU-001 --cwd . --dossier .workflow/dossiers/WU-001-verifier.yaml
|
|
453
|
+
workflow-supervisor delegate-doctor --agent all --probe --require-pass
|
|
418
454
|
```
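If you script the `delegate` command, it can help to build the argument list in one place. The sketch below assembles the flags shown in the commands above (`--agent`, `--role`, `--unit`, `--cwd`, `--dossier`). The helper name `build_delegate_argv` is hypothetical, and the default dossier path assumes the `.workflow/dossiers/<unit>-<role>.yaml` naming seen in the examples is a convention rather than a requirement.

```python
# Agents this README describes as delegation targets; `generic` is excluded
# because the README says it is not a certified worker delegation target.
CERTIFIED_DELEGATE_AGENTS = {"codex", "claude-code"}


def build_delegate_argv(agent, role, unit, cwd=".", dossier=None):
    """Build an argv list for `workflow-supervisor delegate` (illustrative only)."""
    if agent not in CERTIFIED_DELEGATE_AGENTS:
        raise ValueError(f"{agent!r} is not a certified delegation target")
    if dossier is None:
        # Assumed naming pattern, copied from the example commands.
        dossier = f".workflow/dossiers/{unit}-{role}.yaml"
    return [
        "workflow-supervisor", "delegate",
        "--agent", agent,
        "--role", role,
        "--unit", unit,
        "--cwd", cwd,
        "--dossier", dossier,
    ]
```

The returned list can be passed to `subprocess.run(argv, check=True)`. Passing a list instead of a shell string avoids quoting problems with paths.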
|
|
419
455
|
|
|
420
|
-
|
|
456
|
+
The package exposes two binary names:
|
|
421
457
|
|
|
422
|
-
```
|
|
423
|
-
|
|
458
|
+
```text
|
|
459
|
+
workflow-supervisor
|
|
460
|
+
workflow-skills
|
|
424
461
|
```
|
|
425
462
|
|
|
426
|
-
|
|
463
|
+
`workflow-skills` is kept as an alias. Prefer `workflow-supervisor` in user-facing instructions.
|
|
464
|
+
|
|
465
|
+
## Codex, Claude Code, And Generic Targets
|
|
427
466
|
|
|
428
|
-
|
|
467
|
+
Codex support uses:
|
|
468
|
+
|
|
469
|
+
- `SKILL.md`
|
|
470
|
+
- `agents/openai.yaml`
|
|
471
|
+
- the `codex` CLI adapter for delegated workers
|
|
472
|
+
|
|
473
|
+
Claude Code support uses:
|
|
474
|
+
|
|
475
|
+
- the same `SKILL.md` folders
|
|
476
|
+
- the `claude` CLI adapter for delegated workers
|
|
477
|
+
- optional emitted context through `CLAUDE.md`
|
|
478
|
+
|
|
479
|
+
The presence of `agents/openai.yaml` does not mean Claude Code is unsupported. It only means Codex has a specific metadata format.
|
|
480
|
+
|
|
481
|
+
Generic support is for custom Markdown-reading agent setups:
|
|
429
482
|
|
|
430
483
|
```bash
|
|
431
|
-
npx workflow-supervisor emit-context --agent generic --out AGENTS.md
|
|
484
|
+
npx workflow-supervisor emit-context --agent generic --skills workflow-supervisor,workflow-docs --out AGENTS.md
|
|
432
485
|
```
|
|
433
486
|
|
|
434
|
-
|
|
487
|
+
Generic is not a certified worker delegation target.
|
|
435
488
|
|
|
436
|
-
|
|
489
|
+
## Package Layout
|
|
437
490
|
|
|
438
491
|
```text
|
|
439
|
-
|
|
492
|
+
skills/ Skill instructions
|
|
493
|
+
skills/*/agents/ Agent metadata, including Codex openai.yaml files
|
|
494
|
+
schemas/ DossierV1 and WorkerReportV1 schemas
|
|
495
|
+
adapters/ Codex and Claude Code delegate command arrays
|
|
496
|
+
docs/ CLI, artifact, compatibility, and troubleshooting docs
|
|
497
|
+
assets/ README image assets
|
|
498
|
+
bin/workflow-skills.mjs Installer, validator, delegation wrapper, and command dispatch
|
|
440
499
|
```
|
|
441
500
|
|
|
442
|
-
The
|
|
501
|
+
The npm package includes:
|
|
502
|
+
|
|
503
|
+
```text
|
|
504
|
+
skills
|
|
505
|
+
adapters
|
|
506
|
+
schemas
|
|
507
|
+
docs
|
|
508
|
+
assets
|
|
509
|
+
bin
|
|
510
|
+
README.md
|
|
511
|
+
LICENSE
|
|
512
|
+
```
|
|
443
513
|
|
|
444
|
-
|
|
514
|
+
## Publishing Checklist
|
|
445
515
|
|
|
446
|
-
|
|
516
|
+
Before publishing:
|
|
447
517
|
|
|
448
518
|
```bash
|
|
449
|
-
|
|
450
|
-
|
|
451
|
-
workflow-supervisor delegate --agent codex --role implementer --unit U2 --dossier .workflow/dossiers/U2-implementer.yaml
|
|
452
|
-
workflow-supervisor delegate --agent claude-code --role verifier --unit U2 --dossier .workflow/dossiers/U2-verifier.yaml
|
|
453
|
-
workflow-supervisor delegate-doctor --agent all --probe --require-pass
|
|
519
|
+
npm run validate
|
|
520
|
+
npm pack --dry-run
|
|
454
521
|
```
|
|
455
522
|
|
|
456
|
-
|
|
523
|
+
`npm run validate` checks skill structure, adapter metadata, schema artifacts, and the test suite.
|
|
524
|
+
|
|
525
|
+
`npm pack --dry-run` shows exactly what will be included in the npm package.
|
|
526
|
+
|
|
527
|
+
The package also has:
|
|
528
|
+
|
|
529
|
+
- `prepublishOnly`: runs `npm run validate`
|
|
530
|
+
- `engines.node`: `>=18`
|
|
531
|
+
- `license`: `MIT`
|
|
532
|
+
- `bin.workflow-supervisor`: `bin/workflow-skills.mjs`
|
|
533
|
+
- `bin.workflow-skills`: `bin/workflow-skills.mjs`
|
|
534
|
+
|
|
535
|
+
## Rules For Agents Reading This README
|
|
457
536
|
|
|
458
|
-
If you are an
|
|
537
|
+
If you are an agent using this package:
|
|
459
538
|
|
|
460
539
|
1. Do not start work before complete intake.
|
|
461
|
-
2. Do not infer missing permissions from
|
|
462
|
-
3.
|
|
463
|
-
4. Do not
|
|
464
|
-
5.
|
|
465
|
-
6.
|
|
466
|
-
7.
|
|
467
|
-
8.
|
|
468
|
-
9.
|
|
469
|
-
|
|
470
|
-
|
|
540
|
+
2. Do not infer missing permissions from words like "autonomous", "generate", or "work until done".
|
|
541
|
+
3. If `$workflow-supervisor` is explicit, always create at least one work unit.
|
|
542
|
+
4. Do not delegate without a valid `DossierV1`.
|
|
543
|
+
5. Use separate worker agents when supported by the environment.
|
|
544
|
+
6. Do not silently collapse worker agents into same-session roleplay.
|
|
545
|
+
7. Treat same-session verification as `self-check`, not `independent-verifier`.
|
|
546
|
+
8. Trust only structured `WorkerReportV1` results from delegated workers.
|
|
547
|
+
9. Treat verifier edits as invalid.
|
|
548
|
+
10. Keep `.workflow/` ignored and local unless the user explicitly asks to publish it.
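Rules 8 and 9 can be sketched as a small acceptance check. This is an illustration under stated assumptions, not the package's validator: the field names `schema`, `role`, and `changed_surfaces` come from the `WorkerReportV1` example in the 0.1.1 README, the authoritative definition lives in `schemas/`, and treating a non-empty `changed_surfaces` list as evidence of verifier edits is an assumption. The function name `review_report` is hypothetical.

```python
def review_report(report):
    """Return (accepted, reason) for a worker report dictionary.

    Illustrative only; field names are taken from the 0.1.1 README example.
    """
    # Rule 8: only structured WorkerReportV1 results are trusted.
    if not isinstance(report, dict) or report.get("schema") != "WorkerReportV1":
        return False, "not a structured WorkerReportV1 report"
    # Rule 9: a verifier that changed anything is treated as invalid.
    if report.get("role") == "verifier" and report.get("changed_surfaces"):
        return False, "verifier reported edits"
    return True, "ok"
```

A free-text reply or a verifier report that lists changed files is rejected; anything else still goes through the supervisor's normal review.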
|
|
549
|
+
|
|
550
|
+
The promise is not magic autonomy. The promise is disciplined supervision: clear setup, bounded work, scoped workers, structured reports, evidence, repair, and a clean final handoff.
|