workflow-supervisor 0.1.1 → 0.1.3

package/README.md CHANGED
@@ -1,212 +1,151 @@
1
1
  # Workflow Supervisor
2
2
 
3
- Workflow Supervisor is a small skill pack and npm helper for making coding agents handle complex work with discipline.
3
+ Workflow Supervisor is a strict supervision skill pack for agent work that needs to stay organized, resumable, and evidence-backed.
4
4
 
5
- It turns a vague request like:
5
+ It is for moments when you do not want an agent to jump straight into implementation, lose the thread halfway through, verify its own work, or quietly skip important handoffs. When you ask for the supervisor, it asks the right setup questions, turns the work into small units, creates worker dossiers, delegates scoped work to real worker agents when possible, verifies the results, and leaves a clear outcome trail.
6
+
7
+ Example prompt:
6
8
 
7
9
  ```text
8
- Use workflow-supervisor to migrate the database from SQLite to LanceDB.
10
+ Use $workflow-supervisor to build a FastAPI Naive RAG demo for a healthcare specialist agent.
9
11
  ```
10
12
 
11
- into a supervised workflow with intake, source grounding, bounded work units, concrete worker dossiers, independent verification, repair loops, and evidence-backed output.
12
-
13
- It currently supports certified automated worker delegation for **Codex** and **Claude Code**.
13
+ The correct first response is not code but an intake packet. That is intentional.
14
14
 
15
15
  ![Workflow Supervisor hero image showing the supervisor coordinating source corpus, work units, dossiers, roles, loop policy, acceptance, repair, workflow docs, and final outputs](assets/workflow-supervisor-hero.png)
16
16
 
17
- ## What It Is
18
-
19
- Workflow Supervisor is not another agent product. It is a thin coordination layer for agents that already exist.
20
-
21
- The supervisor is the visible agent in the conversation. It owns the plan, asks the user questions, creates work units, validates worker contracts, launches workers, reads reports, and decides what happens next.
22
-
23
- Workers are short-lived CLI runs:
24
-
25
- ```bash
26
- workflow-supervisor delegate --agent codex --role implementer --unit U1 --dossier .workflow/dossiers/U1.yaml
27
- workflow-supervisor delegate --agent claude-code --role verifier --unit U1 --dossier .workflow/dossiers/U1.yaml
28
- ```
29
-
30
- Each worker gets only the context it needs. It returns one structured report. The supervisor remains the coordinator.
31
-
32
- ## The Moat
17
+ ## What You Get
33
18
 
34
- The moat is not a clever prompt. The moat is the set of gates that prevent agents from drifting.
19
+ Workflow Supervisor gives you a repeatable workflow for serious agent tasks:
35
20
 
36
- Workflow Supervisor enforces:
21
+ - a complete intake before work starts
22
+ - a source map, even when the only source is the user prompt
23
+ - a source-requirement coverage ledger so roadmap items and exit criteria cannot disappear
24
+ - a `SPEC.md` review gate where humans can ask questions, request revisions, block, defer, or approve before work units are finalized
25
+ - bounded work units, including `WU-001` for tiny tasks
26
+ - dossiers that tell each worker exactly what to do and what not to touch
27
+ - separate implementer, verifier, repair, and documenter responsibilities
28
+ - structured worker reports instead of loose prose
29
+ - evidence-based verification
30
+ - repair loops that stay tied to failed acceptance rows
31
+ - durable `.workflow/` state when the work needs to survive context loss
32
+ - a final report with checks, risks, workers, and next actions
37
33
 
38
- - complete intake before work starts
39
- - no keyword-based skipping
40
- - bounded work units before implementation
41
- - machine-checkable `DossierV1` before workers start
42
- - role separation between implementer, verifier, repair, and documenter
43
- - normalized `WorkerReportV1` output from every worker
44
- - evidence required for PASS
45
- - automatic BLOCKED reports for vague dossiers, missing CLIs, auth failures, invalid output, timeouts, forbidden edits, or verifier mutations
34
+ The main design choice is simple: the supervisor coordinates, workers do scoped work, and the CLI stays small.
46
35
 
47
- That means the system does not merely ask agents to behave. It blocks unsafe or vague execution before it happens.
36
+ ## The Mental Model
48
37
 
49
- ## What It Solves
38
+ Think of Workflow Supervisor as a project lead inside the current agent conversation.
50
39
 
51
- Large agent tasks fail in predictable ways:
40
+ The supervisor:
52
41
 
53
- - the agent starts before asking enough questions
54
- - "autonomous" or "use workflow supervisor" gets treated as permission to act
55
- - the context window fills with unrelated history
56
- - handoffs are vague
57
- - implementers verify their own work
58
- - verifiers say "looks good" without evidence
59
- - repair work expands scope
60
- - progress disappears after context compaction
61
- - every platform has a different output style
42
+ - asks the user for missing decisions
43
+ - decides when work is ready to start
44
+ - creates the plan, units, dossiers, and acceptance rows
45
+ - hands work to role-specific workers
46
+ - reads worker reports
47
+ - routes blockers and repairs
48
+ - decides when verification is good enough
49
+ - applies the final disposition policy
62
50
 
63
- Workflow Supervisor solves this by making the workflow explicit and resumable.
51
+ Workers:
64
52
 
65
- The conversation holds the supervisor. The `.workflow/` artifacts hold durable state. Workers get small, role-specific dossiers instead of the full conversation. Reports come back in one schema.
53
+ - receive one scoped dossier
54
+ - perform one role
55
+ - return one structured report
56
+ - do not talk to the human directly
57
+ - do not choose final disposition
58
+ - do not message each other
66
59
 
67
- ## Architecture
60
+ The CLI:
68
61
 
69
- Workflow Supervisor has two halves: a portable skill pack that teaches an agent how to supervise work, and a small npm CLI that installs those skills, validates contracts, and runs one-shot worker delegations.
62
+ - installs the skills
63
+ - validates skill and schema files
64
+ - validates `DossierV1`
65
+ - invokes one worker process
66
+ - validates `WorkerReportV1`
67
+ - returns a normalized report to the supervisor
70
68
 
71
- The current chat agent is always the supervisor. It owns intake, planning, source grounding, work-unit boundaries, dossiers, verification decisions, repair routing, and final disposition. The CLI never becomes a workflow daemon or queue. It is a helper that copies skills into supported agent directories, emits portable Markdown context, validates schema-backed artifacts, and invokes a single role-scoped worker process when delegation is authorized.
69
+ It is not a daemon, queue, dashboard, scheduler, or full agent harness.
72
70
 
73
71
  ```mermaid
74
72
  flowchart TB
75
- User["User"] --> Supervisor["Supervisor agent in current chat"]
76
- Supervisor --> Skills["Installed skills: SKILL.md and agent metadata"]
77
- Supervisor --> State["Durable state: .workflow/ in the target workspace"]
78
- Supervisor --> CLI["workflow-supervisor CLI: bin/workflow-skills.mjs"]
79
-
80
- subgraph Package["npm package"]
81
- SkillsSource["skills/"]
82
- AdapterDefs["adapters/"]
83
- SchemaDefs["schemas/"]
84
- Docs["docs/"]
85
- CLI
86
- end
87
-
88
- CLI --> SkillsSource
89
- CLI --> AdapterDefs
90
- CLI --> SchemaDefs
91
- CLI --> Docs
92
- CLI --> Adapters["Adapter command array"]
93
- Adapters --> Codex["Codex CLI worker"]
94
- Adapters --> Claude["Claude Code CLI worker"]
95
- Codex --> Report["WorkerReportV1 JSON"]
73
+ User["User"] --> Supervisor["Supervisor agent in the current chat"]
74
+ Supervisor --> Intake["Complete intake"]
75
+ Supervisor --> Sources["Source corpus"]
76
+ Supervisor --> Units["Work units"]
77
+ Supervisor --> Matrix["Acceptance matrix"]
78
+ Supervisor --> Dossiers["DossierV1 files"]
79
+ Supervisor --> CLI["workflow-supervisor CLI"]
80
+ CLI --> Codex["Codex worker"]
81
+ CLI --> Claude["Claude Code worker"]
82
+ Codex --> Report["WorkerReportV1"]
96
83
  Claude --> Report
97
84
  Report --> Supervisor
85
+ Supervisor --> Docs[".workflow/ state"]
86
+ Supervisor --> Outcome["Final report"]
98
87
  ```
99
88
 
100
- The package layout is intentionally simple:
89
+ ## What Happens When You Invoke It
101
90
 
102
- - `skills/` contains the opt-in supervisor skills and OpenAI metadata prompts.
103
- - `bin/workflow-skills.mjs` contains the installer, validator, context emitter, delegation wrapper, surface guard, and command dispatch.
104
- - `schemas/` defines `DossierV1` and `WorkerReportV1`.
105
- - `adapters/` defines certified Codex and Claude Code command arrays.
106
- - `docs/` explains CLI usage, portable delegation semantics, compatibility, artifacts, and troubleshooting.
107
- - `.workflow/` is created in consuming projects as private supervisor working memory, not as package state.
91
+ When you explicitly invoke `workflow-supervisor`, `$workflow-supervisor`, or say to use the skill, the workflow enters `strict_full_workflow`.
108
92
 
109
- ```mermaid
110
- flowchart LR
111
- Package["workflow-supervisor package"] --> Install["install"]
112
- Package --> Emit["emit-context"]
113
- Package --> Validate["validate and validate-dossier"]
114
- Package --> Delegate["delegate and delegate-doctor"]
115
-
116
- Install --> CodexTarget["Codex target: ~/.agents/skills or project .agents/skills"]
117
- Install --> ClaudeTarget["Claude target: ~/.claude/skills or project .claude/skills"]
118
- Install --> Gitignore["Project .gitignore contains .workflow/"]
119
-
120
- Emit --> PortableFile["Portable context file: AGENTS.md or CLAUDE.md"]
121
- Validate --> SkillGate["Skill, schema, adapter, and dossier gates"]
122
- Delegate --> WorkerRun["One role-scoped worker CLI process"]
123
- ```
93
+ Strict mode means task size does not matter. Even if the request is "make a function that adds two numbers", explicit supervisor invocation still means the full workflow:
124
94
 
125
- Delegation is a guarded subprocess, not an open-ended conversation between agents. The supervisor creates a concrete `DossierV1`, the CLI validates it before any worker starts, the adapter receives a role-scoped prompt and the `WorkerReportV1` schema, and the wrapper normalizes failure modes into structured `BLOCKED` reports.
95
+ 1. Present the complete intake packet.
96
+ 2. Build or record the source corpus.
97
+ 3. Create a source-requirement coverage ledger.
98
+ 4. Create a `SPEC.md` review packet or file.
99
+ 5. Pause for human Q&A, revisions, block, defer, or approval when the path is human-in-loop.
100
+ 6. Create at least one work unit.
101
+ 7. Create acceptance rows that preserve source-scope fidelity.
102
+ 8. Create dossiers for the planned workers.
103
+ 9. Create a worker-agent plan.
104
+ 10. Ask for approval when the selected path is human-in-loop.
105
+ 11. Delegate scoped work to real workers when the environment supports it.
106
+ 12. Verify with evidence.
107
+ 13. Route repair work if verification fails.
108
+ 14. Refresh docs or outcome state.
109
+ 15. Report final status and next action.
126
110
 
127
- ```mermaid
128
- sequenceDiagram
129
- participant S as Supervisor
130
- participant C as "workflow-supervisor delegate"
131
- participant D as "DossierV1 validator"
132
- participant G as "Surface guard"
133
- participant A as "Agent adapter"
134
- participant W as "Worker CLI"
135
-
136
- S->>C: Role, unit ID, workspace, dossier path
137
- C->>D: Parse JSON, YAML, or fenced YAML
138
- D-->>C: Valid dossier or BLOCKED invalid_dossier
139
- C->>G: Snapshot git status or explicit surfaces
140
- C->>A: Build command from adapters/<agent>/adapter.json
141
- A->>W: Run one CLI process with role prompt and schema
142
- W-->>A: stdout, stderr, exit code, timeout signal
143
- A-->>C: Raw worker output
144
- C->>C: Extract and validate WorkerReportV1
145
- C->>G: Compare after-state against allowed and forbidden surfaces
146
- C-->>S: PASS, FAIL, or normalized BLOCKED WorkerReportV1
147
- ```
111
+ This rule exists to prevent the agent from deciding that a task is "too simple" and quietly skipping the supervisor.
148
112
 
149
- The supervisor loop is therefore stateful at the workflow level but stateless at the worker level. Every worker run is fresh, bounded by one dossier, and reduced back to one report before the supervisor decides the next step.
113
+ ## Intake
150
114
 
151
- ```mermaid
152
- stateDiagram-v2
153
- [*] --> Intake
154
- Intake --> SourceGrounding: Complete intake
155
- SourceGrounding --> WorkUnits: Sources ranked
156
- WorkUnits --> AcceptanceMatrix: Units bounded
157
- AcceptanceMatrix --> Dossier: Evidence rows ready
158
- Dossier --> Delegation: DossierV1 valid
159
- Delegation --> Verification: WorkerReportV1 returned
160
- Verification --> Repair: FAIL or actionable BLOCKED
161
- Repair --> Verification: Repair report returned
162
- Verification --> Documentation: PASS with evidence
163
- Documentation --> FinalDisposition: Outcome recorded
164
- FinalDisposition --> [*]
165
- Dossier --> Intake: Missing decision
166
- Delegation --> Intake: Worker BLOCKED with human question
167
- ```
115
+ The supervisor must get explicit answers to these seven items before planning deeply, creating a goal, delegating workers, implementing, publishing, or taking irreversible action:
168
116
 
169
- ## What It Is Used For
170
-
171
- Use it for work that is:
172
-
173
- - broad or ambiguous
174
- - multi-step
175
- - high-risk
176
- - likely to need repair loops
177
- - likely to exceed one context window
178
- - important enough to require independent verification
179
- - easier to handle as several bounded units
180
-
181
- Good examples:
117
+ ```text
118
+ 1. Objective and source: what artifact, spec, repo path, document, ticket, or source set controls the work?
119
+ 2. Execution path: autonomous_goal or human_in_loop?
120
+ 3. Mode: sequential, parallel where safe, or staged parallel?
121
+ 4. Delegation: automated worker delegation, native threads/subagents if available, or same-session phased?
122
+ 5. Final disposition: keep local, open PR, push main, deploy/publish, or ask at the end?
123
+ 6. Boundaries: may I install dependencies, call external services, use credentials, or only edit local files?
124
+ 7. State artifacts: create .workflow docs, use another artifact directory, or keep state inline?
125
+ ```
182
126
 
183
- - migrate SQLite storage to LanceDB
184
- - refactor authentication across several modules
185
- - update docs from a new API spec
186
- - implement a feature with tests and verification
187
- - review and repair a messy PR
188
- - produce durable workflow docs for a long-running task
127
+ If any answer is missing or vague, the supervisor asks only for the missing pieces and stops. Phrases like "work autonomously", "just do it", or "use your judgment" do not fill in the missing intake fields.
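The intake gate described above can be sketched in a few lines of Python. This is only an illustration, not the skill's implementation: the field keys in `INTAKE_FIELDS` and the function name are hypothetical, and only the seven items and the three non-answer phrases come from the README.

```python
from typing import Dict, List, Optional

# Hypothetical keys for the seven intake items listed in the README.
INTAKE_FIELDS = [
    "objective_and_source",
    "execution_path",
    "mode",
    "delegation",
    "final_disposition",
    "boundaries",
    "state_artifacts",
]

# Phrases the README says do not fill in a missing intake field.
NON_ANSWERS = {"work autonomously", "just do it", "use your judgment"}


def missing_intake_fields(answers: Dict[str, Optional[str]]) -> List[str]:
    """Return the intake fields that still need an explicit answer."""
    missing = []
    for field in INTAKE_FIELDS:
        value = (answers.get(field) or "").strip().lower()
        # Empty answers and blanket phrases both count as missing.
        if not value or value in NON_ANSWERS:
            missing.append(field)
    return missing
```

A supervisor following this sketch would ask only for the returned fields and stop until they are answered.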
189
128
 
190
- Do not use it for:
129
+ Expected human pauses are normal. A workflow can move from `WAITING_FOR_HUMAN` back to `ACTIVE` after the user approves a plan or answers a blocker question.
191
130
 
192
- - tiny edits
193
- - one-off shell commands
194
- - obvious single-file changes
195
- - quick explanations
196
- - tasks where a normal agent turn is enough
131
+ In `autonomous_goal`, a human clarification pause is not automatically a terminal failed goal. The supervisor records the blocker, asks the smallest needed question, updates SPEC/Q&A/coverage state when the answer arrives, refreshes only affected downstream artifacts, and resumes from the recorded next action. If an old Codex goal was already terminal-blocked, the resumed workflow references it as history and continues from workflow state or a newly authorized goal binding.
197
132
 
198
- ## How It Works
133
+ ## The Workflow
199
134
 
200
- The lifecycle is:
135
+ The full loop looks like this:
201
136
 
202
137
  ```text
203
- intake
204
- -> source grounding
138
+ complete intake
139
+ -> source corpus
140
+ -> source-requirement coverage ledger
141
+ -> SPEC review and Q&A gate
205
142
  -> work units
143
+ -> loop policy
206
144
  -> acceptance matrix
207
- -> DossierV1
208
- -> worker delegation
209
- -> WorkerReportV1
145
+ -> dossiers
146
+ -> approval or autonomous path gate
147
+ -> worker handoff
148
+ -> worker report
210
149
  -> verification
211
150
  -> repair if needed
212
151
  -> re-verification
@@ -214,257 +153,398 @@ intake
214
153
  -> final disposition
215
154
  ```
216
155
 
217
- ### 1. Intake
218
-
219
- The supervisor must ask the user for every required decision before it plans deeply or starts work:
156
+ The worker lifecycle is tracked separately:
220
157
 
221
158
  ```text
222
- 1. Objective and source
223
- 2. Execution path: autonomous_goal or human_in_loop
224
- 3. Mode: sequential, parallel where safe, or staged parallel
225
- 4. Delegation: automated workers, native subagents if available, or same-session phased
226
- 5. Final disposition: keep local, open PR, push, deploy, publish, or ask at end
227
- 6. Boundaries: installs, network, credentials, destructive operations, forbidden surfaces
228
- 7. State artifacts: .workflow docs, another directory, or inline state
159
+ planned -> handed_off -> acknowledged -> reported -> verified -> closed
229
160
  ```
230
161
 
231
- If any answer is missing or vague, the supervisor asks again and stops.
162
+ This makes it possible to see where the workflow is, which worker owns which piece, what evidence exists, and what should happen next.
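As a rough illustration, the worker lifecycle can be modeled as a linear state machine. The state names come from the README. The `advance` helper is hypothetical, and the rule that states cannot be skipped is an assumption rather than documented behavior.

```python
# Lifecycle states in the order the README lists them.
LIFECYCLE = ["planned", "handed_off", "acknowledged", "reported", "verified", "closed"]


def advance(state: str) -> str:
    """Return the next lifecycle state (assumes a strictly linear lifecycle)."""
    if state not in LIFECYCLE:
        raise ValueError(f"unknown worker state: {state}")
    index = LIFECYCLE.index(state)
    if index == len(LIFECYCLE) - 1:
        raise ValueError("worker is already closed")
    return LIFECYCLE[index + 1]
```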
163
+
164
+ For source-of-truth builds, the coverage ledger is the guardrail against "green but incomplete" outcomes. Every material source requirement must be mapped to a work unit and acceptance row, explicitly deferred by the user, blocked as a scope decision, or marked non-material with a reason. Residual risks and future-work notes cannot contain unimplemented material source requirements in a PASS workflow.
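One way to picture the coverage rule is a check over ledger rows. The sketch below is an assumption-heavy illustration: the `LedgerRow` fields and disposition labels are hypothetical names for the four outcomes the paragraph describes, not the package's actual ledger format.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical labels for the four outcomes the README allows.
DISPOSITIONS = {"mapped", "deferred_by_user", "blocked_scope_decision", "non_material"}


@dataclass
class LedgerRow:
    requirement: str
    disposition: str
    work_unit: Optional[str] = None
    acceptance_row: Optional[str] = None
    reason: Optional[str] = None


def ledger_gaps(rows: List[LedgerRow]) -> List[str]:
    """Return one message per requirement that is not properly accounted for."""
    gaps = []
    for row in rows:
        if row.disposition not in DISPOSITIONS:
            gaps.append(f"{row.requirement}: unknown disposition")
        elif row.disposition == "mapped" and not (row.work_unit and row.acceptance_row):
            gaps.append(f"{row.requirement}: needs a work unit and an acceptance row")
        elif row.disposition == "non_material" and not row.reason:
            gaps.append(f"{row.requirement}: non-material needs a reason")
    return gaps
```

Under this sketch, a workflow could only report PASS when `ledger_gaps` returns an empty list.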
165
+
166
+ `SPEC.md` is the human review contract before final work units. In human-in-loop mode, the supervisor stops at the draft SPEC so the human can ask questions, request revisions, mark items deferred, block the workflow, or approve. The workflow continues only after explicit approval.
232
167
 
233
- ### 2. Source Grounding
168
+ When a workflow pauses for a human decision, the decision is recorded as state rather than treated as a restart. The next supervisor pass updates the affected coverage rows, SPEC fields, work units, acceptance rows, dossiers, or verification results, invalidates stale artifacts, and continues from the saved `Next Action`.
234
169
 
235
- The supervisor identifies the source of truth: files, specs, docs, tickets, user decisions, commands, or external constraints.
170
+ ## Skills In The Pack
236
171
 
237
- If source authority is unclear, the first work unit becomes discovery instead of implementation.
172
+ The skill pack is made of small focused skills. The supervisor can use them as phase instructions.
238
173
 
239
- ### 3. Work Units
174
+ | Skill | What it does |
+ |---|---|
+ | `workflow-supervisor` | Coordinates the whole workflow, gates, workers, verification, repair, and final disposition. |
+ | `source-corpus` | Lists and ranks sources, gaps, contradictions, authority, freshness, and allowed next action. |
+ | `work-unit` | Turns the objective into bounded units with dependencies, surfaces, readiness, and done criteria. |
+ | `loop-policy` | Defines execution path, mode, approval gates, repair limits, budgets, goal policy, and resume behavior. |
+ | `acceptance-matrix` | Turns requirements into evidence rows with PASS, FAIL, BLOCKED, and waiver handling. |
+ | `dossier-builder` | Creates concrete `DossierV1` contracts for workers. |
+ | `worker-roles` | Defines role boundaries so implementers, verifiers, repair authors, and documenters do not blur together. |
+ | `workflow-docs` | Creates or refreshes durable `.workflow/` artifacts when state needs to persist. |
240
184
 
241
- The supervisor splits the objective into bounded units.
185
+ Loading a skill does not spawn a worker. A skill is instruction context for the current supervisor. A worker is a separate role-scoped execution run.
242
186
 
243
- For a SQLite to LanceDB migration, units might be:
187
+ ## Files The Workflow Creates
188
+
189
+ Workflow state lives under `.workflow/` by default. The directory is local supervisor memory, not product code.
190
+
191
+ In a Git-backed project, `.workflow/` must be in `.gitignore` before these files are written. Project installs do this automatically.
192
+
193
+ Common workflow files:
194
+
195
+ | File | Created from | Purpose |
+ |---|---|---|
+ | `.workflow/WORKFLOW.md` | `workflow-supervisor`, `loop-policy`, `workflow-docs` | Main state, objective, execution path, policy, stop gates, next action. |
+ | `.workflow/SOURCE-CORPUS.md` | `source-corpus`, `workflow-docs` | Source ranking, missing sources, contradictions, assumptions. |
+ | `.workflow/SPEC.md` | `workflow-supervisor`, `source-corpus`, `workflow-docs` | Human-reviewable interpretation, requirement coverage, Q&A, and approval decision before work units. |
+ | `.workflow/WORK-UNITS.md` | `work-unit`, `workflow-docs` | Unit list, dependencies, sequencing, blocked units. |
+ | `.workflow/DOSSIER.md` or `.workflow/dossiers/*.yaml` | `dossier-builder`, `workflow-docs` | Worker contracts for implementation, verification, repair, or documentation. |
+ | `.workflow/WORKER-MAP.md` | `workflow-supervisor`, `worker-roles`, `workflow-docs` | Worker names, roles, transports, lifecycle, reports, blockers. |
+ | `.workflow/ACCEPTANCE-MATRIX.md` | `acceptance-matrix`, `workflow-docs` | Evidence rows and material PASS, FAIL, BLOCKED states. |
+ | `.workflow/VERIFICATION-REPORT.md` | verifier worker, `acceptance-matrix`, `workflow-docs` | Verification evidence, findings, skipped checks, residual risks. |
+ | `.workflow/REPAIR-TICKETS.md` | repair worker, `workflow-docs` | Repair tasks tied to failed rows or verifier findings. |
+ | `.workflow/DECISIONS.md` | supervisor, `workflow-docs` | User decisions, assumptions, reversals, unresolved questions. |
+ | `.workflow/HANDOFF.md` | supervisor, `workflow-docs` | Resume pack for another agent or later session. |
+ | `.workflow/OUTCOME.md` | supervisor, documenter worker, `workflow-docs` | Final status, checks, risks, disposition, next action. |
+ | `.workflow/GOAL-STATE.md` | supervisor, `workflow-docs` | Codex goal mirror, blocked-goal history, human-decision resume checkpoint, and durable backup. |
210
+
211
+ For documentation-heavy workflows, `workflow-docs` can also create:
244
212
 
245
213
  ```text
246
- U1 dependency and config
247
- U2 storage adapter
248
- U3 data migration path
249
- U4 tests and regression checks
250
- U5 docs and outcome report
214
+ .workflow/DOCUMENTATION-BRIEF.md
215
+ .workflow/CONTENT-INVENTORY.md
216
+ .workflow/OUTLINE.md
217
+ .workflow/CONTENT-DRAFT.md
218
+ .workflow/CLAIMS-REGISTER.md
219
+ .workflow/STYLE-GUIDE.md
220
+ .workflow/GLOSSARY.md
221
+ .workflow/ASSET-REGISTER.md
222
+ .workflow/REVIEW-PLAN.md
223
+ .workflow/REVISION-QUEUE.md
224
+ .workflow/PUBLISHING-CHECKLIST.md
225
+ .workflow/PUBLICATION-LOG.md
226
+ .workflow/MAINTENANCE-PLAN.md
251
227
  ```
252
228
 
253
- ### 4. DossierV1
229
+ It should not create all of these by default. It should create the smallest useful set.
230
+
231
+ ## Dossiers
254
232
 
255
- Before any worker starts, the supervisor creates a concrete `DossierV1`.
233
+ A dossier is the worker contract. It is how the supervisor prevents vague delegation.
256
234
 
257
- A dossier names:
235
+ Before any worker starts, the supervisor creates a concrete `DossierV1` with:
258
236
 
259
- - the exact work unit
260
- - the worker role
237
+ - workflow name
238
+ - work unit
239
+ - dossier id
240
+ - worker name
241
+ - worker role
242
+ - delegation transport
243
+ - start condition
244
+ - objective and non-goals
245
+ - source corpus and must-read sources
261
246
  - allowed surfaces
262
247
  - forbidden surfaces
263
- - must-read sources
264
248
  - acceptance rows
265
- - required evidence
266
249
  - adversarial checks
250
+ - required commands or evidence
251
+ - worker prompt
252
+ - supervisor checkpoints
253
+ - report schemas
267
254
  - stop gates
268
- - required report schema
255
+ - assumptions
256
+ - open questions
269
257
 
270
- Then the package validates it:
258
+ Validate a dossier before delegation:
271
259
 
272
260
  ```bash
273
- workflow-supervisor validate-dossier .workflow/dossiers/U2-implementer.yaml --role implementer --unit U2 --json
261
+ workflow-supervisor validate-dossier .workflow/dossiers/WU-001-implementer.yaml --role implementer --unit WU-001 --json
274
262
  ```
275
263
 
276
- If the dossier says `all files`, `TBD`, `unknown`, `as needed`, or leaves open questions unresolved, it fails. No worker starts.
264
+ The validator rejects things like `TBD`, `unknown`, `all files`, `entire repo`, unresolved open questions, role mismatches, unit mismatches, missing forbidden surfaces, and prompts that do not require `WorkerReportV1`.
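To make the placeholder rule concrete, here is a minimal Python sketch that scans dossier text for the vague phrases named above. It is not the package's validator, which also checks roles, units, surfaces, and prompts. The `find_placeholders` helper and its word-boundary matching are assumptions made for illustration.

```python
import re
from typing import List

# Vague phrases the README says the validator rejects.
PLACEHOLDERS = ["TBD", "unknown", "all files", "entire repo"]


def find_placeholders(dossier_text: str) -> List[str]:
    """Return the placeholder phrases that appear anywhere in the dossier text."""
    found = []
    for phrase in PLACEHOLDERS:
        # Whole-phrase, case-insensitive match (an assumption about strictness).
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, dossier_text, flags=re.IGNORECASE):
            found.append(phrase)
    return found
```

In practice the real gate is `workflow-supervisor validate-dossier`; a check like this would only be a quick pre-flight before calling it.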
277
265
 
278
- ### 5. Worker Delegation
266
+ ## Workers
279
267
 
280
- The supervisor launches one role-scoped worker through Codex or Claude Code.
268
+ The required worker responsibilities are:
281
269
 
282
- ```bash
283
- workflow-supervisor delegate --agent codex --role implementer --unit U2 --dossier .workflow/dossiers/U2-implementer.yaml
284
- ```
270
+ | Responsibility | CLI role value | What it does |
+ |---|---|---|
+ | Implementer | `implementer` | Changes only the allowed surfaces named in the dossier. |
+ | Verifier | `verifier` | Checks the work against acceptance rows and must not edit implementation. |
+ | Repair author | `repair` | Converts failed rows or verifier findings into actionable repair work. |
+ | Documenter | `documenter` | Updates workflow or outcome docs from evidence. |
276
+
277
+ The skill text may say "repair-author" because that is the human role. The CLI schema uses `repair`.
278
+
279
+ Workers receive only their scoped handoff:
280
+
281
+ - role
282
+ - dossier
283
+ - sources
284
+ - acceptance rows
285
+ - stop gates
286
+ - report schema
287
+
288
+ They return one terminal `WorkerReportV1`.
289
+
290
+ ## Worker Reports
285
291
 
286
- The worker does not get the whole chat. It receives the dossier and report schema.
292
+ Every delegated worker returns this machine-shaped report:
293
+
294
+ ```json
295
+ {
296
+ "schema": "WorkerReportV1",
297
+ "status": "PASS",
298
+ "role": "verifier",
299
+ "unit_id": "WU-001",
300
+ "summary": "Verified the API responses and retrieval behavior against the acceptance rows.",
301
+ "changed_surfaces": [],
302
+ "evidence": ["pytest tests/test_api.py passed", "manual inspection of /health response"],
303
+ "checks_run": ["pytest tests/test_api.py"],
304
+ "skipped_checks": [],
305
+ "findings": [],
306
+ "blocking_question": null,
307
+ "next_action": "supervisor_review",
308
+ "adapter": null,
309
+ "guard": null,
310
+ "reason": null
311
+ }
312
+ ```
287
313
 
288
- ### 6. Verification And Repair
314
+ The supervisor trusts the report shape, not loose prose. A PASS without evidence is invalid. A verifier that edits implementation is invalid. A worker that asks the human directly is converted into a blocker for the supervisor to route.
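Two of these rules can be expressed as simple checks on the report dictionary. The following is a sketch under stated assumptions, not the CLI's guard logic: the function name is hypothetical, and it assumes a non-empty `changed_surfaces` list is how a verifier's edits would show up in the report.

```python
from typing import Any, Dict, List


def report_problems(report: Dict[str, Any]) -> List[str]:
    """Return the rule violations found in a WorkerReportV1-shaped dict."""
    problems = []
    if report.get("schema") != "WorkerReportV1":
        problems.append("schema must be WorkerReportV1")
    # A PASS without evidence is invalid.
    if report.get("status") == "PASS" and not report.get("evidence"):
        problems.append("PASS requires evidence")
    # Assumption: verifier edits are visible as changed surfaces.
    if report.get("role") == "verifier" and report.get("changed_surfaces"):
        problems.append("verifier must not change surfaces")
    return problems
```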
289
315
 
290
- A verifier checks the implementer's work against acceptance rows.
316
+ ## How The Supervisor Talks To Workers
291
317
 
292
- If verification fails, the supervisor creates repair work. Repairs must point back to verifier findings or acceptance rows. After repair, verification runs again.
318
+ The portable worker path is one CLI command:
293
319
 
294
- ### 7. Final Disposition
320
+ ```bash
321
+ workflow-supervisor delegate \
322
+ --agent <codex|claude-code> \
323
+ --role <implementer|verifier|repair|documenter> \
324
+ --unit <unit-id> \
325
+ --cwd <workspace> \
326
+ --dossier <path>
327
+ ```
295
328
 
296
- The supervisor applies the final disposition chosen during intake:
329
+ The command:
297
330
 
298
- - keep changes local
299
- - open a PR
300
- - push
301
- - deploy
302
- - publish
303
- - ask at the end
331
+ 1. Validates the dossier as `DossierV1`.
332
+ 2. Builds a scoped worker prompt.
333
+ 3. Starts the selected agent CLI with an adapter command array.
334
+ 4. Captures stdout, stderr, exit code, and timeout.
335
+ 5. Extracts and validates `WorkerReportV1`.
336
+ 6. Runs surface and role guards.
337
+ 7. Prints one normalized JSON report for the supervisor.
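A supervisor-side wrapper around this command might look like the Python sketch below. The flags and the allowed agent and role values are taken from the README. The helper names are hypothetical, and `run_delegate` assumes the CLI prints exactly one JSON document on stdout, which the README implies but does not spell out.

```python
import json
import subprocess
from typing import List

AGENTS = {"codex", "claude-code"}
ROLES = {"implementer", "verifier", "repair", "documenter"}


def build_delegate_command(agent: str, role: str, unit: str, cwd: str, dossier: str) -> List[str]:
    """Build the argument list for one delegate run, rejecting unknown values."""
    if agent not in AGENTS:
        raise ValueError(f"not a certified worker adapter: {agent}")
    if role not in ROLES:
        raise ValueError(f"unknown role: {role}")
    return [
        "workflow-supervisor", "delegate",
        "--agent", agent,
        "--role", role,
        "--unit", unit,
        "--cwd", cwd,
        "--dossier", dossier,
    ]


def run_delegate(command: List[str]) -> dict:
    """Run the CLI and parse its report (assumes stdout is a single JSON document)."""
    completed = subprocess.run(command, capture_output=True, text=True, check=False)
    return json.loads(completed.stdout)
```

Passing the command as a list rather than a shell string avoids quoting problems with dossier paths.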
304
338
 
305
- No final irreversible action is inferred from vibes.
339
+ Certified worker adapters:
306
340
 
307
- ## What The User Sees
341
+ - `codex`
342
+ - `claude-code`
308
343
 
309
- The user sees the supervisor, not worker chatter.
344
+ The `generic` target is for Markdown instruction export. It is not a certified automated worker adapter.
310
345
 
311
- In `human_in_loop`, the user sees:
346
+ Check local adapter readiness:
312
347
 
313
- ```text
314
- intake question
315
- approval packet
316
- progress summaries
317
- blocker questions if needed
318
- final report
348
+ ```bash
349
+ workflow-supervisor delegate-doctor --agent all --probe --require-pass
319
350
  ```
320
351
 
321
- In `autonomous_goal`, the user sees:
352
+ If a worker adapter is missing, unauthenticated, times out, returns invalid output, edits forbidden surfaces, or returns PASS without evidence, the delegate command returns a structured `BLOCKED` report.
353
+
354
+ ## No Silent Fallbacks
355
+
356
+ If the environment can create, message, or delegate to worker agents, the supervisor must use real workers for implementation, verification, repair, and documentation responsibilities.
357
+
358
+ If it cannot, it must record:
322
359
 
323
360
  ```text
324
- intake question
325
- execution plan
326
- periodic progress summaries
327
- blockers only when needed
328
- final report
361
+ worker_agent_unavailable
329
362
  ```
330
363
 
331
- Workers do not ask the user questions directly. They return `BLOCKED` and the supervisor decides how to route it.
364
+ Then it must stop for a human decision unless complete intake explicitly selected `same_session_phased`.
332
365
 
333
- ## What The Output Is
366
+ Same-session phased work is allowed only when selected. Verification in that mode is a `self-check`, not an `independent-verifier`.
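The fallback rule for the case where no worker agent is available can be summarized in a small decision function. This is a hedged sketch: the function name and the `action` labels are hypothetical, while `worker_agent_unavailable`, `same_session_phased`, and `self-check` come from the README.

```python
from typing import Dict, Optional


def handle_unavailable_workers(delegation_choice: str) -> Dict[str, Optional[str]]:
    """Decide what happens when no real worker agent can be used."""
    # The unavailability is always recorded, whatever happens next.
    outcome: Dict[str, Optional[str]] = {"record": "worker_agent_unavailable"}
    if delegation_choice == "same_session_phased":
        # Only allowed because intake explicitly selected it.
        outcome["action"] = "continue_same_session"
        outcome["verification"] = "self-check"
    else:
        outcome["action"] = "stop_for_human_decision"
        outcome["verification"] = None
    return outcome
```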
334
367
 
335
- Workflow Supervisor produces three kinds of output.
368
+ ## Install
336
369
 
337
- ### 1. Durable Workflow State
370
+ Install from npm once published:
371
+
372
+ ```bash
373
+ npm install -g workflow-supervisor
374
+ workflow-supervisor validate
375
+ ```
338
376
 
339
- Usually under `.workflow/`:
377
+ Use with `npx`:
340
378
 
341
- ```text
342
- WORKFLOW.md
343
- SOURCE-CORPUS.md
344
- WORK-UNITS.md
345
- DOSSIER.md
346
- WORKER-MAP.md
347
- ACCEPTANCE-MATRIX.md
348
- VERIFICATION-REPORT.md
349
- REPAIR-TICKETS.md
350
- DECISIONS.md
351
- HANDOFF.md
352
- OUTCOME.md
353
- GOAL-STATE.md
379
+ ```bash
380
+ npx workflow-supervisor list
354
381
  ```
355
382
 
356
- These files make the workflow resumable after context compaction or handoff.
383
+ Install skills for Codex:
357
384
 
358
- ### 2. Worker Reports
385
+ ```bash
386
+ npx workflow-supervisor install --agent codex --scope user
387
+ ```
359
388
 
360
- Every worker returns `WorkerReportV1`:
389
+ Install skills for Claude Code:
361
390
 
362
- ```json
363
- {
364
- "schema": "WorkerReportV1",
365
- "status": "PASS",
366
- "role": "verifier",
367
- "unit_id": "U2",
368
- "summary": "Verified LanceDB-backed search path.",
369
- "changed_surfaces": [],
370
- "evidence": ["pytest tests/test_search.py passed"],
371
- "checks_run": ["pytest tests/test_search.py"],
372
- "skipped_checks": [],
373
- "findings": [],
374
- "blocking_question": null,
375
- "next_action": "supervisor_review",
376
- "adapter": null,
377
- "guard": null,
378
- "reason": null
379
- }
391
+ ```bash
392
+ npx workflow-supervisor install --agent claude-code --scope user
380
393
  ```
381
394
 
382
- ### 3. Final Supervisor Report
395
+ Install both certified targets into a project:
383
396
 
384
- The final report names:
397
+ ```bash
398
+ npx workflow-supervisor install --agent all --scope project --project .
399
+ ```
385
400
 
386
- - execution path
387
- - goal status
388
- - sources used
389
- - work units completed
390
- - workers delegated
391
- - checks run
392
- - skipped checks
393
- - repairs performed
394
- - residual risks
395
- - final disposition
396
- - next action
401
+ Project installs copy the skill folders into project-level agent directories and ensure the target project `.gitignore` contains:
397
402
 
398
- ## How To Install
403
+ ```gitignore
404
+ .workflow/
405
+ ```
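A minimal sketch of the `.gitignore` step, assuming it only needs to append the entry when it is missing. The helper name `ensureWorkflowIgnored` is hypothetical and works on file contents as a string; how the real installer reads and writes the file is not shown in the README.

```javascript
// Hypothetical helper: returns .gitignore contents that include ".workflow/".
// It is a pure string transform, so reading and writing the file is left to the caller.
function ensureWorkflowIgnored(contents) {
  const entry = ".workflow/";
  const lines = contents.split(/\r?\n/);

  // Already present on its own line: leave the contents untouched.
  if (lines.some((line) => line.trim() === entry)) {
    return contents;
  }

  // Add a separating newline only if the file does not already end with one.
  const needsNewline = contents.length > 0 && !contents.endsWith("\n");
  return contents + (needsNewline ? "\n" : "") + entry + "\n";
}

const updated = ensureWorkflowIgnored("node_modules/");
console.log(JSON.stringify(updated));
```

Because the function returns its input unchanged when the entry already exists, running it repeatedly does not duplicate the line.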
399
406
 
400
407
  From a local checkout:
401
408
 
402
409
  ```bash
403
410
  git clone https://github.com/NikolaCehic/workflow-supervisor.git
404
411
  cd workflow-supervisor
412
+ npm install
405
413
  npm run validate
406
414
  ```
407
415
 
408
- Install for Codex:
416
+ ## Basic Use
409
417
 
410
- ```bash
411
- npx workflow-supervisor install --agent codex --scope user
418
+ After installing the skills, ask your agent:
419
+
420
+ ```text
421
+ Use $workflow-supervisor to implement a healthcare specialist FastAPI Naive RAG demo.
412
422
  ```
413
423
 
414
- Install for Claude Code:
424
+ You should expect:
425
+
426
+ 1. The supervisor presents the complete intake packet.
427
+ 2. You answer every intake item.
428
+ 3. If the path is `human_in_loop`, the supervisor gives you an approval packet before implementation.
429
+ 4. The supervisor creates the source-requirement coverage ledger and `SPEC.md`.
430
+ 5. You ask questions, request revisions, block, defer, or approve the SPEC.
431
+ 6. After approval, the supervisor creates work units, acceptance rows, and dossiers.
432
+ 7. The supervisor delegates scoped work to workers when supported.
433
+ 8. Workers return structured reports.
434
+ 9. The supervisor verifies, routes repairs if needed, and gives you the final result.
435
+
436
+ If you only want a normal quick edit, do not invoke `$workflow-supervisor`.
437
+
438
+ ## CLI Reference
439
+
440
+ Common commands:
415
441
 
416
442
  ```bash
417
- npx workflow-supervisor install --agent claude-code --scope user
443
+ workflow-supervisor list
444
+ workflow-supervisor validate
445
+ workflow-supervisor doctor --agent all
446
+ workflow-supervisor install --agent codex --scope user
447
+ workflow-supervisor install --agent claude-code --scope user
448
+ workflow-supervisor install --agent all --scope project --project .
449
+ workflow-supervisor emit-context --agent generic --out AGENTS.md
450
+ workflow-supervisor validate-dossier .workflow/dossiers/WU-001-implementer.yaml --role implementer --unit WU-001 --json
451
+ workflow-supervisor delegate --agent codex --role implementer --unit WU-001 --cwd . --dossier .workflow/dossiers/WU-001-implementer.yaml
452
+ workflow-supervisor delegate --agent claude-code --role verifier --unit WU-001 --cwd . --dossier .workflow/dossiers/WU-001-verifier.yaml
453
+ workflow-supervisor delegate-doctor --agent all --probe --require-pass
418
454
  ```
419
455
 
420
- Install for both in a project:
456
+ The package exposes two binary names:
421
457
 
422
- ```bash
423
- npx workflow-supervisor install --agent all --scope project --project .
458
+ ```text
459
+ workflow-supervisor
460
+ workflow-skills
424
461
  ```
425
462
 
426
- Project installs also add `.workflow/` to the target project's `.gitignore`. Workflow state is local working memory by default; it should not be pushed with the consuming codebase unless the user explicitly chooses that.
463
+ `workflow-skills` is kept as an alias. Prefer `workflow-supervisor` in user-facing instructions.
464
+
465
+ ## Codex, Claude Code, And Generic Targets
427
466
 
428
- Export generic Markdown instructions:
467
+ Codex support uses:
468
+
469
+ - `SKILL.md`
470
+ - `agents/openai.yaml`
471
+ - the `codex` CLI adapter for delegated workers
472
+
473
+ Claude Code support uses:
474
+
475
+ - the same `SKILL.md` folders
476
+ - the `claude` CLI adapter for delegated workers
477
+ - optional emitted context through `CLAUDE.md`
478
+
479
+ The presence of `agents/openai.yaml` does not mean Claude Code is unsupported. It only means Codex has a specific metadata format.
480
+
481
+ Generic support is for custom Markdown-reading agent setups:
429
482
 
430
483
  ```bash
431
- npx workflow-supervisor emit-context --agent generic --out AGENTS.md
484
+ npx workflow-supervisor emit-context --agent generic --skills workflow-supervisor,workflow-docs --out AGENTS.md
432
485
  ```
433
486
 
434
- ## How To Use
487
+ Generic is not a certified worker delegation target.
435
488
 
436
- In Codex or Claude Code, ask explicitly:
489
+ ## Package Layout
437
490
 
438
491
  ```text
439
- Use $workflow-supervisor to migrate this repo from SQLite to LanceDB.
492
+ skills/ Skill instructions
493
+ skills/*/agents/ Agent metadata, including Codex openai.yaml files
494
+ schemas/ DossierV1 and WorkerReportV1 schemas
495
+ adapters/ Codex and Claude Code delegate command arrays
496
+ docs/ CLI, artifact, compatibility, and troubleshooting docs
497
+ assets/ README image assets
498
+ bin/workflow-skills.mjs Installer, validator, delegation wrapper, and command dispatch
440
499
  ```
441
500
 
442
- The correct next step is not implementation. The correct next step is the intake packet.
501
+ The npm package includes:
502
+
503
+ ```text
504
+ skills
505
+ adapters
506
+ schemas
507
+ docs
508
+ assets
509
+ bin
510
+ README.md
511
+ LICENSE
512
+ ```
443
513
 
444
- After intake is complete, the supervisor proceeds through work units, dossiers, workers, verification, repair, and final disposition.
514
+ ## Publishing Checklist
445
515
 
446
- Useful CLI commands:
516
+ Before publishing:
447
517
 
448
518
  ```bash
449
- workflow-supervisor validate
450
- workflow-supervisor validate-dossier .workflow/dossiers/U2-implementer.yaml --role implementer --unit U2 --json
451
- workflow-supervisor delegate --agent codex --role implementer --unit U2 --dossier .workflow/dossiers/U2-implementer.yaml
452
- workflow-supervisor delegate --agent claude-code --role verifier --unit U2 --dossier .workflow/dossiers/U2-verifier.yaml
453
- workflow-supervisor delegate-doctor --agent all --probe --require-pass
519
+ npm run validate
520
+ npm pack --dry-run
454
521
  ```
455
522
 
456
- ## For Agents
523
+ `npm run validate` checks skill structure, adapter metadata, schema artifacts, and the test suite.
524
+
525
+ `npm pack --dry-run` shows exactly what will be included in the npm package.
526
+
527
+ The package also has:
528
+
529
+ - `prepublishOnly`: runs `npm run validate`
530
+ - `engines.node`: `>=18`
531
+ - `license`: `MIT`
532
+ - `bin.workflow-supervisor`: `bin/workflow-skills`
533
+ - `bin.workflow-skills`: `bin/workflow-skills`
534
+
535
+ ## Rules For Agents Reading This README
457
536
 
458
- If you are an AI agent reading this README, follow these rules:
537
+ If you are an agent using this package:
459
538
 
460
539
  1. Do not start work before complete intake.
461
- 2. Do not infer missing permissions from user intent.
462
- 3. Do not delegate without a valid `DossierV1`.
463
- 4. Do not accept worker prose as completion.
464
- 5. Trust only `WorkerReportV1` reports.
465
- 6. Treat verifier edits as invalid.
466
- 7. Ask the user only through the supervisor.
467
- 8. Before creating `.workflow/` in a codebase, ensure `.gitignore` contains `.workflow/`.
468
- 9. Preserve enough `.workflow/` state for another agent to resume, but do not stage or publish it unless explicitly requested.
469
-
470
- The point is not to add ceremony. The point is to make complex agent work hard to derail.
540
+ 2. Do not infer missing permissions from words like "autonomous", "generate", or "work until done".
541
+ 3. If `$workflow-supervisor` is explicit, always create at least one work unit.
542
+ 4. Do not delegate without a valid `DossierV1`.
543
+ 5. Use separate worker agents when supported by the environment.
544
+ 6. Do not silently collapse worker agents into same-session roleplay.
545
+ 7. Treat same-session verification as `self-check`, not `independent-verifier`.
546
+ 8. Trust only structured `WorkerReportV1` results from delegated workers.
547
+ 9. Treat verifier edits as invalid.
548
+ 10. Keep `.workflow/` ignored and local unless the user explicitly asks to publish it.
549
+
550
+ The promise is not magic autonomy. The promise is disciplined supervision: clear setup, bounded work, scoped workers, structured reports, evidence, repair, and a clean final handoff.