npm - workflow-supervisor - Versions diffs - 0.1.1 → 0.1.3 - Mend

workflow-supervisor 0.1.1 → 0.1.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (14)

package/README.md +397 -317
package/bin/workflow-skills.mjs +3 -2
package/docs/skill-reference.md +5 -5
package/docs/troubleshooting.md +16 -0
package/package.json +1 -1
package/skills/acceptance-matrix/SKILL.md +29 -2
package/skills/loop-policy/SKILL.md +5 -2
package/skills/work-unit/SKILL.md +19 -0
package/skills/workflow-docs/SKILL.md +2 -1
package/skills/workflow-docs/references/goal-resume.md +48 -3
package/skills/workflow-docs/references/templates.md +2 -0
package/skills/workflow-docs/references/workflow-control.md +149 -0
package/skills/workflow-supervisor/SKILL.md +174 -27
package/skills/workflow-supervisor/agents/openai.yaml +1 -1

package/README.md CHANGED

@@ -1,212 +1,151 @@
 # Workflow Supervisor
-Workflow Supervisor is a small skill pack and npm helper for making coding agents handle complex work with discipline.
+Workflow Supervisor is a strict supervision skill pack for agent work that needs to stay organized, resumable, and evidence-backed.
-It turns a vague request like:
+It is for moments when you do not want an agent to jump straight into implementation, lose the thread halfway through, verify its own work, or quietly skip important handoffs. You ask for the supervisor, the supervisor asks the right setup questions, turns the work into small units, creates worker dossiers, delegates scoped work to real worker agents when possible, verifies the results, and leaves a clear outcome trail.
+Example prompt:
 ```text
-Use workflow-supervisor to migrate the database from SQLite to LanceDB.
+Use $workflow-supervisor to build a FastAPI Naive RAG demo for a healthcare specialist agent.
 ```
-into a supervised workflow with intake, source grounding, bounded work units, concrete worker dossiers, independent verification, repair loops, and evidence-backed output.
-It currently supports certified automated worker delegation for **Codex** and **Claude Code**.
+The correct first response is not code. The correct first response is an intake packet. That is intentional.
 ![Workflow Supervisor hero image showing the supervisor coordinating source corpus, work units, dossiers, roles, loop policy, acceptance, repair, workflow docs, and final outputs](assets/workflow-supervisor-hero.png)
-## What It Is
-Workflow Supervisor is not another agent product. It is a thin coordination layer for agents that already exist.
-The supervisor is the visible agent in the conversation. It owns the plan, asks the user questions, creates work units, validates worker contracts, launches workers, reads reports, and decides what happens next.
-Workers are short-lived CLI runs:
-```bash
-workflow-supervisor delegate --agent codex --role implementer --unit U1 --dossier .workflow/dossiers/U1.yaml
-workflow-supervisor delegate --agent claude-code --role verifier --unit U1 --dossier .workflow/dossiers/U1.yaml
-```
-Each worker gets only the context it needs. It returns one structured report. The supervisor remains the coordinator.
-## The Moat
+## What You Get
-The moat is not a clever prompt. The moat is the set of gates that prevent agents from drifting.
+Workflow Supervisor gives you a repeatable workflow for serious agent tasks:
-Workflow Supervisor enforces:
+- a complete intake before work starts
+- a source map, even when the only source is the user prompt
+- a source-requirement coverage ledger so roadmap items and exit criteria cannot disappear
+- a `SPEC.md` review gate where humans can ask questions, request revisions, block, defer, or approve before work units are finalized
+- bounded work units, including `WU-001` for tiny tasks
+- dossiers that tell each worker exactly what to do and what not to touch
+- separate implementer, verifier, repair, and documenter responsibilities
+- structured worker reports instead of loose prose
+- evidence-based verification
+- repair loops that stay tied to failed acceptance rows
+- durable `.workflow/` state when the work needs to survive context loss
+- a final report with checks, risks, workers, and next actions
-- complete intake before work starts
-- no keyword-based skipping
-- bounded work units before implementation
-- machine-checkable `DossierV1` before workers start
-- role separation between implementer, verifier, repair, and documenter
-- normalized `WorkerReportV1` output from every worker
-- evidence required for PASS
-- automatic BLOCKED reports for vague dossiers, missing CLIs, auth failures, invalid output, timeouts, forbidden edits, or verifier mutations
+The main design choice is simple: the supervisor coordinates, workers do scoped work, and the CLI stays small.
-That means the system does not merely ask agents to behave. It blocks unsafe or vague execution before it happens.
+## The Mental Model
-## What It Solves
+Think of Workflow Supervisor as a project lead inside the current agent conversation.
-Large agent tasks fail in predictable ways:
+The supervisor:
-- the agent starts before asking enough questions
-- "autonomous" or "use workflow supervisor" gets treated as permission to act
-- the context window fills with unrelated history
-- handoffs are vague
-- implementers verify their own work
-- verifiers say "looks good" without evidence
-- repair work expands scope
-- progress disappears after context compaction
-- every platform has a different output style
+- asks the user for missing decisions
+- decides when work is ready to start
+- creates the plan, units, dossiers, and acceptance rows
+- hands work to role-specific workers
+- reads worker reports
+- routes blockers and repairs
+- decides when verification is good enough
+- applies the final disposition policy
-Workflow Supervisor solves this by making the workflow explicit and resumable.
+Workers:
-The conversation holds the supervisor. The `.workflow/` artifacts hold durable state. Workers get small, role-specific dossiers instead of the full conversation. Reports come back in one schema.
+- receive one scoped dossier
+- perform one role
+- return one structured report
+- do not talk to the human directly
+- do not choose final disposition
+- do not message each other
-## Architecture
+The CLI:
-Workflow Supervisor has two halves: a portable skill pack that teaches an agent how to supervise work, and a small npm CLI that installs those skills, validates contracts, and runs one-shot worker delegations.
+- installs the skills
+- validates skill and schema files
+- validates `DossierV1`
+- invokes one worker process
+- validates `WorkerReportV1`
+- returns a normalized report to the supervisor
-The current chat agent is always the supervisor. It owns intake, planning, source grounding, work-unit boundaries, dossiers, verification decisions, repair routing, and final disposition. The CLI never becomes a workflow daemon or queue. It is a helper that copies skills into supported agent directories, emits portable Markdown context, validates schema-backed artifacts, and invokes a single role-scoped worker process when delegation is authorized.
+It is not a daemon, queue, dashboard, scheduler, or full agent harness.
 ```mermaid
 flowchart TB
-  User["User"] --> Supervisor["Supervisor agent in current chat"]
-  Supervisor --> Skills["Installed skills: SKILL.md and agent metadata"]
-  Supervisor --> State["Durable state: .workflow/ in the target workspace"]
-  Supervisor --> CLI["workflow-supervisor CLI: bin/workflow-skills.mjs"]
-  subgraph Package["npm package"]
-    SkillsSource["skills/"]
-    AdapterDefs["adapters/"]
-    SchemaDefs["schemas/"]
-    Docs["docs/"]
-    CLI
-  end
-  CLI --> SkillsSource
-  CLI --> AdapterDefs
-  CLI --> SchemaDefs
-  CLI --> Docs
-  CLI --> Adapters["Adapter command array"]
-  Adapters --> Codex["Codex CLI worker"]
-  Adapters --> Claude["Claude Code CLI worker"]
-  Codex --> Report["WorkerReportV1 JSON"]
+  User["User"] --> Supervisor["Supervisor agent in the current chat"]
+  Supervisor --> Intake["Complete intake"]
+  Supervisor --> Sources["Source corpus"]
+  Supervisor --> Units["Work units"]
+  Supervisor --> Matrix["Acceptance matrix"]
+  Supervisor --> Dossiers["DossierV1 files"]
+  Supervisor --> CLI["workflow-supervisor CLI"]
+  CLI --> Codex["Codex worker"]
+  CLI --> Claude["Claude Code worker"]
+  Codex --> Report["WorkerReportV1"]
   Claude --> Report
   Report --> Supervisor
+  Supervisor --> Docs[".workflow/ state"]
+  Supervisor --> Outcome["Final report"]
 ```
-The package layout is intentionally simple:
+## What Happens When You Invoke It
-- `skills/` contains the opt-in supervisor skills and OpenAI metadata prompts.
-- `bin/workflow-skills.mjs` contains the installer, validator, context emitter, delegation wrapper, surface guard, and command dispatch.
-- `schemas/` defines `DossierV1` and `WorkerReportV1`.
-- `adapters/` defines certified Codex and Claude Code command arrays.
-- `docs/` explains CLI usage, portable delegation semantics, compatibility, artifacts, and troubleshooting.
-- `.workflow/` is created in consuming projects as private supervisor working memory, not as package state.
+When you explicitly invoke `workflow-supervisor`, `$workflow-supervisor`, or say to use the skill, the workflow enters `strict_full_workflow`.
-```mermaid
-flowchart LR
-  Package["workflow-supervisor package"] --> Install["install"]
-  Package --> Emit["emit-context"]
-  Package --> Validate["validate and validate-dossier"]
-  Package --> Delegate["delegate and delegate-doctor"]
-  Install --> CodexTarget["Codex target: ~/.agents/skills or project .agents/skills"]
-  Install --> ClaudeTarget["Claude target: ~/.claude/skills or project .claude/skills"]
-  Install --> Gitignore["Project .gitignore contains .workflow/"]
-  Emit --> PortableFile["Portable context file: AGENTS.md or CLAUDE.md"]
-  Validate --> SkillGate["Skill, schema, adapter, and dossier gates"]
-  Delegate --> WorkerRun["One role-scoped worker CLI process"]
-```
+Strict mode means task size does not matter. Even if the request is "make a function that adds two numbers", explicit supervisor invocation still means the full workflow:
-Delegation is a guarded subprocess, not an open-ended conversation between agents. The supervisor creates a concrete `DossierV1`, the CLI validates it before any worker starts, the adapter receives a role-scoped prompt and the `WorkerReportV1` schema, and the wrapper normalizes failure modes into structured `BLOCKED` reports.
+1. Ask the complete intake packet.
+2. Build or record the source corpus.
+3. Create a source-requirement coverage ledger.
+4. Create a `SPEC.md` review packet or file.
+5. Pause for human Q&A, revisions, block, defer, or approval when the path is human-in-loop.
+6. Create at least one work unit.
+7. Create acceptance rows that preserve source-scope fidelity.
+8. Create dossiers for the planned workers.
+9. Create a worker-agent plan.
+10. Ask for approval when the selected path is human-in-loop.
+11. Delegate scoped work to real workers when the environment supports it.
+12. Verify with evidence.
+13. Route repair work if verification fails.
+14. Refresh docs or outcome state.
+15. Report final status and next action.
-```mermaid
-sequenceDiagram
-  participant S as Supervisor
-  participant C as "workflow-supervisor delegate"
-  participant D as "DossierV1 validator"
-  participant G as "Surface guard"
-  participant A as "Agent adapter"
-  participant W as "Worker CLI"
-  S->>C: Role, unit ID, workspace, dossier path
-  C->>D: Parse JSON, YAML, or fenced YAML
-  D-->>C: Valid dossier or BLOCKED invalid_dossier
-  C->>G: Snapshot git status or explicit surfaces
-  C->>A: Build command from adapters/<agent>/adapter.json
-  A->>W: Run one CLI process with role prompt and schema
-  W-->>A: stdout, stderr, exit code, timeout signal
-  A-->>C: Raw worker output
-  C->>C: Extract and validate WorkerReportV1
-  C->>G: Compare after-state against allowed and forbidden surfaces
-  C-->>S: PASS, FAIL, or normalized BLOCKED WorkerReportV1
-```
+This rule exists to prevent the agent from deciding that a task is "too simple" and quietly skipping the supervisor.
-The supervisor loop is therefore stateful at the workflow level but stateless at the worker level. Every worker run is fresh, bounded by one dossier, and reduced back to one report before the supervisor decides the next step.
+## Intake
-```mermaid
-stateDiagram-v2
-  [*] --> Intake
-  Intake --> SourceGrounding: Complete intake
-  SourceGrounding --> WorkUnits: Sources ranked
-  WorkUnits --> AcceptanceMatrix: Units bounded
-  AcceptanceMatrix --> Dossier: Evidence rows ready
-  Dossier --> Delegation: DossierV1 valid
-  Delegation --> Verification: WorkerReportV1 returned
-  Verification --> Repair: FAIL or actionable BLOCKED
-  Repair --> Verification: Repair report returned
-  Verification --> Documentation: PASS with evidence
-  Documentation --> FinalDisposition: Outcome recorded
-  FinalDisposition --> [*]
-  Dossier --> Intake: Missing decision
-  Delegation --> Intake: Worker BLOCKED with human question
-```
+The supervisor must get explicit answers to these seven items before planning deeply, creating a goal, delegating workers, implementing, publishing, or taking irreversible action:
-## What It Is Used For
-Use it for work that is:
-- broad or ambiguous
-- multi-step
-- high-risk
-- likely to need repair loops
-- likely to exceed one context window
-- important enough to require independent verification
-- easier to handle as several bounded units
-Good examples:
+```text
+1. Objective and source: what artifact, spec, repo path, document, ticket, or source set controls the work?
+2. Execution path: autonomous_goal or human_in_loop?
+3. Mode: sequential, parallel where safe, or staged parallel?
+4. Delegation: automated worker delegation, native threads/subagents if available, or same-session phased?
+5. Final disposition: keep local, open PR, push main, deploy/publish, or ask at the end?
+6. Boundaries: may I install dependencies, call external services, use credentials, or only edit local files?
+7. State artifacts: create .workflow docs, use another artifact directory, or keep state inline?
+```
-- migrate SQLite storage to LanceDB
-- refactor authentication across several modules
-- update docs from a new API spec
-- implement a feature with tests and verification
-- review and repair a messy PR
-- produce durable workflow docs for a long-running task
+If any answer is missing or vague, the supervisor asks only for the missing pieces and stops. Phrases like "work autonomously", "just do it", or "use your judgment" do not fill in the missing intake fields.
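The intake rule added above can be illustrated with a small sketch. It is not part of the published package: the field names in `INTAKE_FIELDS` and the `missingIntakeFields` helper are hypothetical, chosen only to mirror the seven intake items, and the exact-match test for non-answers is a simplifying assumption.

```javascript
// Hypothetical keys mirroring the seven intake items; the README defines no data shape.
const INTAKE_FIELDS = [
  "objective_and_source",
  "execution_path",
  "mode",
  "delegation",
  "final_disposition",
  "boundaries",
  "state_artifacts",
];

// Phrases the README says do not fill in a missing intake field.
const NON_ANSWERS = ["work autonomously", "just do it", "use your judgment"];

// Return the intake fields that are absent, empty, or answered with a non-answer.
function missingIntakeFields(answers) {
  return INTAKE_FIELDS.filter((field) => {
    const value = String(answers[field] ?? "").trim().toLowerCase();
    return value === "" || NON_ANSWERS.includes(value);
  });
}

const answers = {
  objective_and_source: "Build the demo described in the user prompt",
  execution_path: "human_in_loop",
  mode: "sequential",
  delegation: "just do it", // a non-answer, so it is still missing
  final_disposition: "keep local",
  boundaries: "", // empty, so it is still missing
  // state_artifacts is not provided at all
};

const missing = missingIntakeFields(answers);
console.log(missing);
```

Under these assumptions the supervisor would ask only about the fields the helper returns, then stop until they are answered.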
-Do not use it for:
+Expected human pauses are normal. A workflow can move from `WAITING_FOR_HUMAN` back to `ACTIVE` after the user approves a plan or answers a blocker question.
-- tiny edits
-- one-off shell commands
-- obvious single-file changes
-- quick explanations
-- tasks where a normal agent turn is enough
+In `autonomous_goal`, a human clarification pause is not automatically a terminal failed goal. The supervisor records the blocker, asks the smallest needed question, updates SPEC/Q&A/coverage state when the answer arrives, refreshes only affected downstream artifacts, and resumes from the recorded next action. If an old Codex goal was already terminal-blocked, the resumed workflow references it as history and continues from workflow state or a newly authorized goal binding.
-## How It Works
+## The Workflow
-The lifecycle is:
+The full loop looks like this:
 ```text
-intake
--> source grounding
+complete intake
+-> source corpus
+-> source-requirement coverage ledger
+-> SPEC review and Q&A gate
 -> work units
+-> loop policy
 -> acceptance matrix
--> DossierV1
--> worker delegation
--> WorkerReportV1
+-> dossiers
+-> approval or autonomous path gate
+-> worker handoff
+-> worker report
 -> verification
 -> repair if needed
 -> re-verification
@@ -214,257 +153,398 @@ intake
 -> final disposition
 ```
-### 1. Intake
-The supervisor must ask the user for every required decision before it plans deeply or starts work:
+The worker lifecycle is tracked separately:
 ```text
-1. Objective and source
-2. Execution path: autonomous_goal or human_in_loop
-3. Mode: sequential, parallel where safe, or staged parallel
-4. Delegation: automated workers, native subagents if available, or same-session phased
-5. Final disposition: keep local, open PR, push, deploy, publish, or ask at end
-6. Boundaries: installs, network, credentials, destructive operations, forbidden surfaces
-7. State artifacts: .workflow docs, another directory, or inline state
+planned -> handed_off -> acknowledged -> reported -> verified -> closed
 ```
-If any answer is missing or vague, the supervisor asks again and stops.
+This makes it possible to see where the workflow is, which worker owns which piece, what evidence exists, and what should happen next.
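The worker lifecycle above can be read as a small state machine. The sketch below is an illustration rather than code from the package: it assumes the lifecycle is strictly linear and forward-only, which the README does not state explicitly, and the `advance` and `canTransition` helpers are hypothetical names.

```javascript
// States in the order the README lists them.
const LIFECYCLE = ["planned", "handed_off", "acknowledged", "reported", "verified", "closed"];

// Assumption: a worker may only move to the next state in the list.
function canTransition(from, to) {
  const index = LIFECYCLE.indexOf(from);
  return index !== -1 && LIFECYCLE.indexOf(to) === index + 1;
}

// Return the next state, or throw if the state is unknown or already terminal.
function advance(current) {
  const index = LIFECYCLE.indexOf(current);
  if (index === -1) throw new Error(`unknown lifecycle state: ${current}`);
  if (index === LIFECYCLE.length - 1) throw new Error("worker is already closed");
  return LIFECYCLE[index + 1];
}

console.log(advance("planned"));
console.log(canTransition("reported", "closed"));
```

Rejecting skipped steps such as `reported -> closed` is one way to ensure that a worker cannot be closed before its report has been verified.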
+For source-of-truth builds, the coverage ledger is the guardrail against "green but incomplete" outcomes. Every material source requirement must be mapped to a work unit and acceptance row, explicitly deferred by the user, blocked as a scope decision, or marked non-material with a reason. Residual risks and future-work notes cannot contain unimplemented material source requirements in a PASS workflow.
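A rough sketch of the coverage rule follows. The ledger row shape (`id`, `material`, `work_unit`, `acceptance_row`, `disposition`, `reason`) and the disposition strings are assumptions made for illustration; the package may store the ledger quite differently.

```javascript
// Return the ids of ledger rows that would leave a source requirement unaccounted for.
function uncoveredRequirements(ledger) {
  return ledger
    .filter((row) => {
      // A non-material requirement is acceptable only when a reason is recorded.
      if (row.material === false) return !row.reason;
      const mapped = Boolean(row.work_unit && row.acceptance_row);
      const deferred = row.disposition === "deferred_by_user"; // hypothetical label
      const blocked = row.disposition === "blocked_scope_decision"; // hypothetical label
      return !(mapped || deferred || blocked);
    })
    .map((row) => row.id);
}

const ledger = [
  { id: "R1", material: true, work_unit: "WU-001", acceptance_row: "A1" },
  { id: "R2", material: true, disposition: "deferred_by_user" },
  { id: "R3", material: true, work_unit: "WU-002" }, // no acceptance row
  { id: "R4", material: false, reason: "Cosmetic wording only" },
  { id: "R5", material: false }, // non-material without a reason
];

const uncovered = uncoveredRequirements(ledger);
console.log(uncovered);
```

In this reading, a workflow could only report PASS when the returned list is empty.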
+`SPEC.md` is the human review contract before final work units. In human-in-loop mode, the supervisor stops at the draft SPEC so the human can ask questions, request revisions, mark items deferred, block the workflow, or approve. The workflow continues only after explicit approval.
-### 2. Source Grounding
+When a workflow pauses for a human decision, the decision is recorded as state rather than treated as a restart. The next supervisor pass updates the affected coverage rows, SPEC fields, work units, acceptance rows, dossiers, or verification results, invalidates stale artifacts, and continues from the saved `Next Action`.
-The supervisor identifies the source of truth: files, specs, docs, tickets, user decisions, commands, or external constraints.
+## Skills In The Pack
-If source authority is unclear, the first work unit becomes discovery instead of implementation.
+The skill pack is made of small focused skills. The supervisor can use them as phase instructions.
-### 3. Work Units
+| Skill | What it does |
+|---|---|
+| `workflow-supervisor` | Coordinates the whole workflow, gates, workers, verification, repair, and final disposition. |
+| `source-corpus` | Lists and ranks sources, gaps, contradictions, authority, freshness, and allowed next action. |
+| `work-unit` | Turns the objective into bounded units with dependencies, surfaces, readiness, and done criteria. |
+| `loop-policy` | Defines execution path, mode, approval gates, repair limits, budgets, goal policy, and resume behavior. |
+| `acceptance-matrix` | Turns requirements into evidence rows with PASS, FAIL, BLOCKED, and waiver handling. |
+| `dossier-builder` | Creates concrete `DossierV1` contracts for workers. |
+| `worker-roles` | Defines role boundaries so implementers, verifiers, repair authors, and documenters do not blur together. |
+| `workflow-docs` | Creates or refreshes durable `.workflow/` artifacts when state needs to persist. |
-The supervisor splits the objective into bounded units.
+Loading a skill does not spawn a worker. A skill is instruction context for the current supervisor. A worker is a separate role-scoped execution run.
-For a SQLite to LanceDB migration, units might be:
+## Files The Workflow Creates
+Workflow state lives under `.workflow/` by default. The directory is local supervisor memory, not product code.
+In a Git-backed project, `.workflow/` must be in `.gitignore` before these files are written. Project installs do this automatically.
+Common workflow files:
+| File | Created from | Purpose |
+|---|---|---|
+| `.workflow/WORKFLOW.md` | `workflow-supervisor`, `loop-policy`, `workflow-docs` | Main state, objective, execution path, policy, stop gates, next action. |
+| `.workflow/SOURCE-CORPUS.md` | `source-corpus`, `workflow-docs` | Source ranking, missing sources, contradictions, assumptions. |
+| `.workflow/SPEC.md` | `workflow-supervisor`, `source-corpus`, `workflow-docs` | Human-reviewable interpretation, requirement coverage, Q&A, and approval decision before work units. |
+| `.workflow/WORK-UNITS.md` | `work-unit`, `workflow-docs` | Unit list, dependencies, sequencing, blocked units. |
+| `.workflow/DOSSIER.md` or `.workflow/dossiers/*.yaml` | `dossier-builder`, `workflow-docs` | Worker contracts for implementation, verification, repair, or documentation. |
+| `.workflow/WORKER-MAP.md` | `workflow-supervisor`, `worker-roles`, `workflow-docs` | Worker names, roles, transports, lifecycle, reports, blockers. |
+| `.workflow/ACCEPTANCE-MATRIX.md` | `acceptance-matrix`, `workflow-docs` | Evidence rows and material PASS, FAIL, BLOCKED states. |
+| `.workflow/VERIFICATION-REPORT.md` | verifier worker, `acceptance-matrix`, `workflow-docs` | Verification evidence, findings, skipped checks, residual risks. |
+| `.workflow/REPAIR-TICKETS.md` | repair worker, `workflow-docs` | Repair tasks tied to failed rows or verifier findings. |
+| `.workflow/DECISIONS.md` | supervisor, `workflow-docs` | User decisions, assumptions, reversals, unresolved questions. |
+| `.workflow/HANDOFF.md` | supervisor, `workflow-docs` | Resume pack for another agent or later session. |
+| `.workflow/OUTCOME.md` | supervisor, documenter worker, `workflow-docs` | Final status, checks, risks, disposition, next action. |
+| `.workflow/GOAL-STATE.md` | supervisor, `workflow-docs` | Codex goal mirror, blocked-goal history, human-decision resume checkpoint, and durable backup. |
+For documentation-heavy workflows, `workflow-docs` can also create:
 ```text
-U1 dependency and config
-U2 storage adapter
-U3 data migration path
-U4 tests and regression checks
-U5 docs and outcome report
+.workflow/DOCUMENTATION-BRIEF.md
+.workflow/CONTENT-INVENTORY.md
+.workflow/OUTLINE.md
+.workflow/CONTENT-DRAFT.md
+.workflow/CLAIMS-REGISTER.md
+.workflow/STYLE-GUIDE.md
+.workflow/GLOSSARY.md
+.workflow/ASSET-REGISTER.md
+.workflow/REVIEW-PLAN.md
+.workflow/REVISION-QUEUE.md
+.workflow/PUBLISHING-CHECKLIST.md
+.workflow/PUBLICATION-LOG.md
+.workflow/MAINTENANCE-PLAN.md
 ```
-### 4. DossierV1
+It should not create all of these by default. It should create the smallest useful set.
+## Dossiers
-Before any worker starts, the supervisor creates a concrete `DossierV1`.
+A dossier is the worker contract. It is how the supervisor prevents vague delegation.
-A dossier names:
+Before any worker starts, the supervisor creates a concrete `DossierV1` with:
-- the exact work unit
-- the worker role
+- workflow name
+- work unit
+- dossier id
+- worker name
+- worker role
+- delegation transport
+- start condition
+- objective and non-goals
+- source corpus and must-read sources
 - allowed surfaces
 - forbidden surfaces
-- must-read sources
 - acceptance rows
-- required evidence
 - adversarial checks
+- required commands or evidence
+- worker prompt
+- supervisor checkpoints
+- report schemas
 - stop gates
-- required report schema
+- assumptions
+- open questions
-Then the package validates it:
+Validate a dossier before delegation:
 ```bash
-workflow-supervisor validate-dossier .workflow/dossiers/U2-implementer.yaml --role implementer --unit U2 --json
+workflow-supervisor validate-dossier .workflow/dossiers/WU-001-implementer.yaml --role implementer --unit WU-001 --json
 ```
-If the dossier says `all files`, `TBD`, `unknown`, `as needed`, or leaves open questions unresolved, it fails. No worker starts.
+The validator rejects things like `TBD`, `unknown`, `all files`, `entire repo`, unresolved open questions, role mismatches, unit mismatches, missing forbidden surfaces, and prompts that do not require `WorkerReportV1`.
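To make the rejection rules concrete, here is a minimal sketch of a few of them. It is not the package's validator: the dossier field names (`role`, `unit_id`, `allowed_surfaces`, `forbidden_surfaces`, `open_questions`) and the problem codes are hypothetical, and only a subset of the listed checks is covered.

```javascript
// Vague terms named in the README, compared case-insensitively.
const VAGUE_TERMS = ["tbd", "unknown", "all files", "entire repo"];

// Return a list of problem codes; an empty list means these checks passed.
function checkDossier(dossier, expected) {
  const problems = [];
  if (dossier.role !== expected.role) problems.push("role_mismatch");
  if (dossier.unit_id !== expected.unit) problems.push("unit_mismatch");

  const forbidden = Array.isArray(dossier.forbidden_surfaces) ? dossier.forbidden_surfaces : [];
  if (forbidden.length === 0) problems.push("missing_forbidden_surfaces");

  if ((dossier.open_questions ?? []).length > 0) problems.push("unresolved_open_questions");

  const surfaces = [...(dossier.allowed_surfaces ?? []), ...forbidden];
  const isVague = (s) => VAGUE_TERMS.includes(String(s).trim().toLowerCase());
  if (surfaces.some(isVague)) problems.push("vague_surface");

  return problems;
}

const vagueDossier = {
  role: "implementer",
  unit_id: "WU-001",
  allowed_surfaces: ["all files"],
  forbidden_surfaces: [],
  open_questions: [],
};

const problems = checkDossier(vagueDossier, { role: "implementer", unit: "WU-001" });
console.log(problems);
```

Returning every problem at once, rather than stopping at the first, lets the supervisor repair the dossier in a single pass before retrying validation.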
-### 5. Worker Delegation
+## Workers
-The supervisor launches one role-scoped worker through Codex or Claude Code.
+The required worker responsibilities are:
-```bash
-workflow-supervisor delegate --agent codex --role implementer --unit U2 --dossier .workflow/dossiers/U2-implementer.yaml
-```
+| Responsibility | CLI role value | What it does |
+|---|---|---|
+| Implementer | `implementer` | Changes only the allowed surfaces named in the dossier. |
+| Verifier | `verifier` | Checks the work against acceptance rows and must not edit implementation. |
+| Repair author | `repair` | Converts failed rows or verifier findings into actionable repair work. |
+| Documenter | `documenter` | Updates workflow or outcome docs from evidence. |
+The skill text may say "repair-author" because that is the human role. The CLI schema uses `repair`.
+Workers receive only their scoped handoff:
+- role
+- dossier
+- sources
+- acceptance rows
+- stop gates
+- report schema
+They return one terminal `WorkerReportV1`.
+## Worker Reports
-The worker does not get the whole chat. It receives the dossier and report schema.
+Every delegated worker returns this machine-shaped report:
+```json
+{
+  "schema": "WorkerReportV1",
+  "status": "PASS",
+  "role": "verifier",
+  "unit_id": "WU-001",
+  "summary": "Verified the API responses and retrieval behavior against the acceptance rows.",
+  "changed_surfaces": [],
+  "evidence": ["pytest tests/test_api.py passed", "manual inspection of /health response"],
+  "checks_run": ["pytest tests/test_api.py"],
+  "skipped_checks": [],
+  "findings": [],
+  "blocking_question": null,
+  "next_action": "supervisor_review",
+  "adapter": null,
+  "guard": null,
+  "reason": null
+}
+```
-### 6. Verification And Repair
+The supervisor trusts the report shape, not loose prose. A PASS without evidence is invalid. A verifier that edits implementation is invalid. A worker that asks the human directly is converted into a blocker for the supervisor to route.
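The two invalid cases named above can be sketched as checks over the report fields shown in the JSON example. This is an illustration only: `reportProblems` and its problem codes are hypothetical, and treating any non-empty `changed_surfaces` from a verifier as an implementation edit is a simplifying assumption.

```javascript
const STATUSES = ["PASS", "FAIL", "BLOCKED"];

// Return problem codes for a WorkerReportV1-shaped object; empty means no problem found.
function reportProblems(report) {
  const problems = [];
  if (report.schema !== "WorkerReportV1") problems.push("wrong_schema");
  if (!STATUSES.includes(report.status)) problems.push("unknown_status");
  // A PASS must carry at least one piece of evidence.
  if (report.status === "PASS" && (report.evidence ?? []).length === 0) {
    problems.push("pass_without_evidence");
  }
  // Assumption: a verifier that reports changed surfaces has edited implementation.
  if (report.role === "verifier" && (report.changed_surfaces ?? []).length > 0) {
    problems.push("verifier_mutation");
  }
  return problems;
}

const suspicious = {
  schema: "WorkerReportV1",
  status: "PASS",
  role: "verifier",
  unit_id: "WU-001",
  changed_surfaces: ["src/app.py"], // hypothetical path
  evidence: [],
};

const found = reportProblems(suspicious);
console.log(found);
```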
-A verifier checks the implementer's work against acceptance rows.
+## How The Supervisor Talks To Workers
-If verification fails, the supervisor creates repair work. Repairs must point back to verifier findings or acceptance rows. After repair, verification runs again.
+The portable worker path is one CLI command:
-### 7. Final Disposition
+```bash
+workflow-supervisor delegate \
+  --agent <codex|claude-code> \
+  --role <implementer|verifier|repair|documenter> \
+  --unit <unit-id> \
+  --cwd <workspace> \
+  --dossier <path>
+```
-The supervisor applies the final disposition chosen during intake:
+The command:
-- keep changes local
-- open a PR
-- push
-- deploy
-- publish
-- ask at the end
+1. Validates the dossier as `DossierV1`.
+2. Builds a scoped worker prompt.
+3. Starts the selected agent CLI with an adapter command array.
+4. Captures stdout, stderr, exit code, and timeout.
+5. Extracts and validates `WorkerReportV1`.
+6. Runs surface and role guards.
+7. Prints one normalized JSON report for the supervisor.
-No final irreversible action is inferred from vibes.
+Certified worker adapters:
-## What The User Sees
+- `codex`
+- `claude-code`
-The user sees the supervisor, not worker chatter.
+The `generic` target is for Markdown instruction export. It is not a certified automated worker adapter.
-In `human_in_loop`, the user sees:
+Check local adapter readiness:
-```text
-intake question
-approval packet
-progress summaries
-blocker questions if needed
-final report
+```bash
+workflow-supervisor delegate-doctor --agent all --probe --require-pass
 ```
-In `autonomous_goal`, the user sees:
+If a worker adapter is missing, unauthenticated, times out, returns invalid output, edits forbidden surfaces, or returns PASS without evidence, the delegate command returns a structured `BLOCKED` report.
+## No Silent Fallbacks
+If the environment can create, message, or delegate to worker agents, the supervisor must use real workers for implementation, verification, repair, and documentation responsibilities.
+If it cannot, it must record:
 ```text
-intake question
-execution plan
-periodic progress summaries
-blockers only when needed
-final report
+worker_agent_unavailable
 ```
-Workers do not ask the user questions directly. They return `BLOCKED` and the supervisor decides how to route it.
+Then it must stop for a human decision unless complete intake explicitly selected `same_session_phased`.
-## What The Output Is
+Same-session phased work is allowed only when selected. Verification in that mode is a `self-check`, not an `independent-verifier`.
-Workflow Supervisor produces three kinds of output.
+## Install
-### 1. Durable Workflow State
+Install from npm once published:
+```bash
+npm install -g workflow-supervisor
+workflow-supervisor validate
+```
-Usually under `.workflow/`:
+Use with `npx`:
-```text
-WORKFLOW.md
-SOURCE-CORPUS.md
-WORK-UNITS.md
-DOSSIER.md
-WORKER-MAP.md
-ACCEPTANCE-MATRIX.md
-VERIFICATION-REPORT.md
-REPAIR-TICKETS.md
-DECISIONS.md
-HANDOFF.md
-OUTCOME.md
-GOAL-STATE.md
+```bash
+npx workflow-supervisor list
 ```
-These files make the workflow resumable after context compaction or handoff.
+Install skills for Codex:
-### 2. Worker Reports
+```bash
+npx workflow-supervisor install --agent codex --scope user
+```
-Every worker returns `WorkerReportV1`:
+Install skills for Claude Code:
-```json
-{
-  "schema": "WorkerReportV1",
-  "status": "PASS",
-  "role": "verifier",
-  "unit_id": "U2",
-  "summary": "Verified LanceDB-backed search path.",
-  "changed_surfaces": [],
-  "evidence": ["pytest tests/test_search.py passed"],
-  "checks_run": ["pytest tests/test_search.py"],
-  "skipped_checks": [],
-  "findings": [],
-  "blocking_question": null,
-  "next_action": "supervisor_review",
-  "adapter": null,
-  "guard": null,
-  "reason": null
-}
+```bash
+npx workflow-supervisor install --agent claude-code --scope user
 ```
-### 3. Final Supervisor Report
+Install both certified targets into a project:
-The final report names:
+```bash
+npx workflow-supervisor install --agent all --scope project --project .
+```
-- execution path
-- goal status
-- sources used
-- work units completed
-- workers delegated
-- checks run
-- skipped checks
-- repairs performed
-- residual risks
-- final disposition
-- next action
+Project installs copy the skill folders into project-level agent directories and ensure the target project `.gitignore` contains:
-## How To Install
+```gitignore
+.workflow/
+```
 From a local checkout:
 ```bash
 git clone https://github.com/NikolaCehic/workflow-supervisor.git
 cd workflow-supervisor
+npm install
 npm run validate
 ```
-Install for Codex:
+## Basic Use
-```bash
-npx workflow-supervisor install --agent codex --scope user
+After installing the skills, ask your agent:
+```text
+Use $workflow-supervisor to implement a healthcare specialist FastAPI Naive RAG demo.
 ```
-Install for Claude Code:
+You should expect:
+1. The supervisor asks the complete intake packet.
+2. You answer every intake item.
+3. If the path is `human_in_loop`, the supervisor gives you an approval packet before implementation.
+4. The supervisor creates the source-requirement coverage ledger and `SPEC.md`.
+5. You ask questions, request revisions, block, defer, or approve the SPEC.
+6. After approval, the supervisor creates work units, acceptance rows, and dossiers.
+7. The supervisor delegates scoped work to workers when supported.
+8. Workers return structured reports.
+9. The supervisor verifies, routes repairs if needed, and gives you the final result.
+If you only want a normal quick edit, do not invoke `$workflow-supervisor`.
+## CLI Reference
+Common commands:
 ```bash
-npx workflow-supervisor install --agent claude-code --scope user
+workflow-supervisor list
+workflow-supervisor validate
+workflow-supervisor doctor --agent all
+workflow-supervisor install --agent codex --scope user
+workflow-supervisor install --agent claude-code --scope user
+workflow-supervisor install --agent all --scope project --project .
+workflow-supervisor emit-context --agent generic --out AGENTS.md
+workflow-supervisor validate-dossier .workflow/dossiers/WU-001-implementer.yaml --role implementer --unit WU-001 --json
+workflow-supervisor delegate --agent codex --role implementer --unit WU-001 --cwd . --dossier .workflow/dossiers/WU-001-implementer.yaml
+workflow-supervisor delegate --agent claude-code --role verifier --unit WU-001 --cwd . --dossier .workflow/dossiers/WU-001-verifier.yaml
+workflow-supervisor delegate-doctor --agent all --probe --require-pass
 ```
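The `validate-dossier` and `delegate` commands listed above can be chained so that a worker is only started for a dossier that validates. The wrapper below is a rough sketch under stated assumptions: `delegate_unit` and the `WORKFLOW_BIN` variable are hypothetical, the dossier path pattern is inferred from the example paths in the command list, and the README does not confirm that `validate-dossier` exits non-zero for an invalid dossier.

```bash
# Hypothetical wrapper around the CLI commands shown in the list above.
# Assumption: validate-dossier returns a non-zero exit code on failure.
WORKFLOW_BIN="${WORKFLOW_BIN:-workflow-supervisor}"

delegate_unit() {
  local agent="$1" role="$2" unit="$3"
  # Path pattern inferred from the examples, e.g. WU-001-implementer.yaml.
  local dossier=".workflow/dossiers/${unit}-${role}.yaml"

  # Refuse to delegate when the dossier does not validate.
  if ! "$WORKFLOW_BIN" validate-dossier "$dossier" --role "$role" --unit "$unit" --json; then
    echo "dossier validation failed: $dossier" >&2
    return 1
  fi

  # Delegate the scoped work to the chosen agent from the current directory.
  "$WORKFLOW_BIN" delegate --agent "$agent" --role "$role" --unit "$unit" --cwd . --dossier "$dossier"
}
```

For example, `delegate_unit codex implementer WU-001` would validate `.workflow/dossiers/WU-001-implementer.yaml` and then hand it to the Codex adapter.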
-Install for both in a project:
+The package exposes two binary names:
-```bash
-npx workflow-supervisor install --agent all --scope project --project .
+```text
+workflow-supervisor
+workflow-skills
 ```
-Project installs also add `.workflow/` to the target project's `.gitignore`. Workflow state is local working memory by default; it should not be pushed with the consuming codebase unless the user explicitly chooses that.
+`workflow-skills` is kept as an alias. Prefer `workflow-supervisor` in user-facing instructions.
+## Codex, Claude Code, And Generic Targets
-Export generic Markdown instructions:
+Codex support uses:
+- `SKILL.md`
+- `agents/openai.yaml`
+- the `codex` CLI adapter for delegated workers
+Claude Code support uses:
+- the same `SKILL.md` folders
+- the `claude` CLI adapter for delegated workers
+- optional emitted context through `CLAUDE.md`
+The presence of `agents/openai.yaml` does not mean Claude Code is unsupported. It only means Codex has a specific metadata format.
+Generic support is for custom Markdown-reading agent setups:
 ```bash
-npx workflow-supervisor emit-context --agent generic --out AGENTS.md
+npx workflow-supervisor emit-context --agent generic --skills workflow-supervisor,workflow-docs --out AGENTS.md
 ```
-## How To Use
+Generic is not a certified worker delegation target.
-In Codex or Claude Code, ask explicitly:
+## Package Layout
 ```text
-Use $workflow-supervisor to migrate this repo from SQLite to LanceDB.
+skills/                  Skill instructions
+skills/*/agents/          Agent metadata, including Codex openai.yaml files
+schemas/                 DossierV1 and WorkerReportV1 schemas
+adapters/                Codex and Claude Code delegate command arrays
+docs/                    CLI, artifact, compatibility, and troubleshooting docs
+assets/                  README image assets
+bin/workflow-skills.mjs  Installer, validator, delegation wrapper, and command dispatch
 ```
-The correct next step is not implementation. The correct next step is the intake packet.
+The npm package includes:
+```text
+skills
+adapters
+schemas
+docs
+assets
+bin
+README.md
+LICENSE
+```
-After intake is complete, the supervisor proceeds through work units, dossiers, workers, verification, repair, and final disposition.
+## Publishing Checklist
-Useful CLI commands:
+Before publishing:
 ```bash
-workflow-supervisor validate
-workflow-supervisor validate-dossier .workflow/dossiers/U2-implementer.yaml --role implementer --unit U2 --json
-workflow-supervisor delegate --agent codex --role implementer --unit U2 --dossier .workflow/dossiers/U2-implementer.yaml
-workflow-supervisor delegate --agent claude-code --role verifier --unit U2 --dossier .workflow/dossiers/U2-verifier.yaml
-workflow-supervisor delegate-doctor --agent all --probe --require-pass
+npm run validate
+npm pack --dry-run
 ```
-## For Agents
+`npm run validate` checks skill structure, adapter metadata, and schema artifacts, and runs the test suite.
+`npm pack --dry-run` shows exactly what will be included in the npm package.
+The package also has:
+- `prepublishOnly`: runs `npm run validate`
+- `engines.node`: `>=18`
+- `license`: `MIT`
+- `bin.workflow-supervisor`: `bin/workflow-skills.mjs`
+- `bin.workflow-skills`: `bin/workflow-skills.mjs`
+## Rules For Agents Reading This README
-If you are an AI agent reading this README, follow these rules:
+If you are an agent using this package:
 1. Do not start work before complete intake.
-2. Do not infer missing permissions from user intent.
-3. Do not delegate without a valid `DossierV1`.
-4. Do not accept worker prose as completion.
-5. Trust only `WorkerReportV1` reports.
-6. Treat verifier edits as invalid.
-7. Ask the user only through the supervisor.
-8. Before creating `.workflow/` in a codebase, ensure `.gitignore` contains `.workflow/`.
-9. Preserve enough `.workflow/` state for another agent to resume, but do not stage or publish it unless explicitly requested.
-The point is not to add ceremony. The point is to make complex agent work hard to derail.
+2. Do not infer missing permissions from words like "autonomous", "generate", or "work until done".
+3. If `$workflow-supervisor` is explicit, always create at least one work unit.
+4. Do not delegate without a valid `DossierV1`.
+5. Use separate worker agents when supported by the environment.
+6. Do not silently collapse worker agents into same-session roleplay.
+7. Treat same-session verification as `self-check`, not `independent-verifier`.
+8. Trust only structured `WorkerReportV1` results from delegated workers.
+9. Treat verifier edits as invalid.
+10. Keep `.workflow/` ignored and local unless the user explicitly asks to publish it.
+The promise is not magic autonomy. The promise is disciplined supervision: clear setup, bounded work, scoped workers, structured reports, evidence, repair, and a clean final handoff.