workflow-supervisor 0.1.1 → 0.1.2

Files changed (3)

package/README.md +380 -318
package/package.json +1 -1
package/skills/workflow-supervisor/SKILL.md +59 -9

package/README.md CHANGED

@@ -1,212 +1,142 @@
 # Workflow Supervisor
-Workflow Supervisor is a small skill pack and npm helper for making coding agents handle complex work with discipline.
+Workflow Supervisor is a strict supervision skill pack for agent work that needs to stay organized, resumable, and evidence-backed.
-It turns a vague request like:
+It is for moments when you do not want an agent to jump straight into implementation, lose the thread halfway through, verify its own work, or quietly skip important handoffs. You ask for the supervisor, the supervisor asks the right setup questions, turns the work into small units, creates worker dossiers, delegates scoped work to real worker agents when possible, verifies the results, and leaves a clear outcome trail.
+Example prompt:
 ```text
-Use workflow-supervisor to migrate the database from SQLite to LanceDB.
+Use $workflow-supervisor to build a FastAPI Naive RAG demo for a healthcare specialist agent.
 ```
-into a supervised workflow with intake, source grounding, bounded work units, concrete worker dossiers, independent verification, repair loops, and evidence-backed output.
-It currently supports certified automated worker delegation for **Codex** and **Claude Code**.
+The correct first response is not code. The correct first response is an intake packet. That is intentional.
 ![Workflow Supervisor hero image showing the supervisor coordinating source corpus, work units, dossiers, roles, loop policy, acceptance, repair, workflow docs, and final outputs](assets/workflow-supervisor-hero.png)
-## What It Is
-Workflow Supervisor is not another agent product. It is a thin coordination layer for agents that already exist.
-The supervisor is the visible agent in the conversation. It owns the plan, asks the user questions, creates work units, validates worker contracts, launches workers, reads reports, and decides what happens next.
-Workers are short-lived CLI runs:
-```bash
-workflow-supervisor delegate --agent codex --role implementer --unit U1 --dossier .workflow/dossiers/U1.yaml
-workflow-supervisor delegate --agent claude-code --role verifier --unit U1 --dossier .workflow/dossiers/U1.yaml
-```
+## What You Get
-Each worker gets only the context it needs. It returns one structured report. The supervisor remains the coordinator.
+Workflow Supervisor gives you a repeatable workflow for serious agent tasks:
-## The Moat
+- a complete intake before work starts
+- a source map, even when the only source is the user prompt
+- bounded work units, including `WU-001` for tiny tasks
+- dossiers that tell each worker exactly what to do and what not to touch
+- separate implementer, verifier, repair, and documenter responsibilities
+- structured worker reports instead of loose prose
+- evidence-based verification
+- repair loops that stay tied to failed acceptance rows
+- durable `.workflow/` state when the work needs to survive context loss
+- a final report with checks, risks, workers, and next actions
-The moat is not a clever prompt. The moat is the set of gates that prevent agents from drifting.
+The main design choice is simple: the supervisor coordinates, workers do scoped work, and the CLI stays small.
-Workflow Supervisor enforces:
+## The Mental Model
-- complete intake before work starts
-- no keyword-based skipping
-- bounded work units before implementation
-- machine-checkable `DossierV1` before workers start
-- role separation between implementer, verifier, repair, and documenter
-- normalized `WorkerReportV1` output from every worker
-- evidence required for PASS
-- automatic BLOCKED reports for vague dossiers, missing CLIs, auth failures, invalid output, timeouts, forbidden edits, or verifier mutations
+Think of Workflow Supervisor as a project lead inside the current agent conversation.
-That means the system does not merely ask agents to behave. It blocks unsafe or vague execution before it happens.
+The supervisor:
-## What It Solves
+- asks the user for missing decisions
+- decides when work is ready to start
+- creates the plan, units, dossiers, and acceptance rows
+- hands work to role-specific workers
+- reads worker reports
+- routes blockers and repairs
+- decides when verification is good enough
+- applies the final disposition policy
-Large agent tasks fail in predictable ways:
+Workers:
-- the agent starts before asking enough questions
-- "autonomous" or "use workflow supervisor" gets treated as permission to act
-- the context window fills with unrelated history
-- handoffs are vague
-- implementers verify their own work
-- verifiers say "looks good" without evidence
-- repair work expands scope
-- progress disappears after context compaction
-- every platform has a different output style
+- receive one scoped dossier
+- perform one role
+- return one structured report
+- do not talk to the human directly
+- do not choose final disposition
+- do not message each other
-Workflow Supervisor solves this by making the workflow explicit and resumable.
+The CLI:
-The conversation holds the supervisor. The `.workflow/` artifacts hold durable state. Workers get small, role-specific dossiers instead of the full conversation. Reports come back in one schema.
+- installs the skills
+- validates skill and schema files
+- validates `DossierV1`
+- invokes one worker process
+- validates `WorkerReportV1`
+- returns a normalized report to the supervisor
-## Architecture
-Workflow Supervisor has two halves: a portable skill pack that teaches an agent how to supervise work, and a small npm CLI that installs those skills, validates contracts, and runs one-shot worker delegations.
-The current chat agent is always the supervisor. It owns intake, planning, source grounding, work-unit boundaries, dossiers, verification decisions, repair routing, and final disposition. The CLI never becomes a workflow daemon or queue. It is a helper that copies skills into supported agent directories, emits portable Markdown context, validates schema-backed artifacts, and invokes a single role-scoped worker process when delegation is authorized.
+It is not a daemon, queue, dashboard, scheduler, or full agent harness.
 ```mermaid
 flowchart TB
-  User["User"] --> Supervisor["Supervisor agent in current chat"]
-  Supervisor --> Skills["Installed skills: SKILL.md and agent metadata"]
-  Supervisor --> State["Durable state: .workflow/ in the target workspace"]
-  Supervisor --> CLI["workflow-supervisor CLI: bin/workflow-skills.mjs"]
-  subgraph Package["npm package"]
-    SkillsSource["skills/"]
-    AdapterDefs["adapters/"]
-    SchemaDefs["schemas/"]
-    Docs["docs/"]
-    CLI
-  end
-  CLI --> SkillsSource
-  CLI --> AdapterDefs
-  CLI --> SchemaDefs
-  CLI --> Docs
-  CLI --> Adapters["Adapter command array"]
-  Adapters --> Codex["Codex CLI worker"]
-  Adapters --> Claude["Claude Code CLI worker"]
-  Codex --> Report["WorkerReportV1 JSON"]
+  User["User"] --> Supervisor["Supervisor agent in the current chat"]
+  Supervisor --> Intake["Complete intake"]
+  Supervisor --> Sources["Source corpus"]
+  Supervisor --> Units["Work units"]
+  Supervisor --> Matrix["Acceptance matrix"]
+  Supervisor --> Dossiers["DossierV1 files"]
+  Supervisor --> CLI["workflow-supervisor CLI"]
+  CLI --> Codex["Codex worker"]
+  CLI --> Claude["Claude Code worker"]
+  Codex --> Report["WorkerReportV1"]
   Claude --> Report
   Report --> Supervisor
+  Supervisor --> Docs[".workflow/ state"]
+  Supervisor --> Outcome["Final report"]
 ```
-The package layout is intentionally simple:
-- `skills/` contains the opt-in supervisor skills and OpenAI metadata prompts.
-- `bin/workflow-skills.mjs` contains the installer, validator, context emitter, delegation wrapper, surface guard, and command dispatch.
-- `schemas/` defines `DossierV1` and `WorkerReportV1`.
-- `adapters/` defines certified Codex and Claude Code command arrays.
-- `docs/` explains CLI usage, portable delegation semantics, compatibility, artifacts, and troubleshooting.
-- `.workflow/` is created in consuming projects as private supervisor working memory, not as package state.
-```mermaid
-flowchart LR
-  Package["workflow-supervisor package"] --> Install["install"]
-  Package --> Emit["emit-context"]
-  Package --> Validate["validate and validate-dossier"]
-  Package --> Delegate["delegate and delegate-doctor"]
-  Install --> CodexTarget["Codex target: ~/.agents/skills or project .agents/skills"]
-  Install --> ClaudeTarget["Claude target: ~/.claude/skills or project .claude/skills"]
-  Install --> Gitignore["Project .gitignore contains .workflow/"]
-  Emit --> PortableFile["Portable context file: AGENTS.md or CLAUDE.md"]
-  Validate --> SkillGate["Skill, schema, adapter, and dossier gates"]
-  Delegate --> WorkerRun["One role-scoped worker CLI process"]
-```
-Delegation is a guarded subprocess, not an open-ended conversation between agents. The supervisor creates a concrete `DossierV1`, the CLI validates it before any worker starts, the adapter receives a role-scoped prompt and the `WorkerReportV1` schema, and the wrapper normalizes failure modes into structured `BLOCKED` reports.
+## What Happens When You Invoke It
-```mermaid
-sequenceDiagram
-  participant S as Supervisor
-  participant C as "workflow-supervisor delegate"
-  participant D as "DossierV1 validator"
-  participant G as "Surface guard"
-  participant A as "Agent adapter"
-  participant W as "Worker CLI"
-  S->>C: Role, unit ID, workspace, dossier path
-  C->>D: Parse JSON, YAML, or fenced YAML
-  D-->>C: Valid dossier or BLOCKED invalid_dossier
-  C->>G: Snapshot git status or explicit surfaces
-  C->>A: Build command from adapters/<agent>/adapter.json
-  A->>W: Run one CLI process with role prompt and schema
-  W-->>A: stdout, stderr, exit code, timeout signal
-  A-->>C: Raw worker output
-  C->>C: Extract and validate WorkerReportV1
-  C->>G: Compare after-state against allowed and forbidden surfaces
-  C-->>S: PASS, FAIL, or normalized BLOCKED WorkerReportV1
-```
+When you explicitly invoke `workflow-supervisor`, `$workflow-supervisor`, or say to use the skill, the workflow enters `strict_full_workflow`.
-The supervisor loop is therefore stateful at the workflow level but stateless at the worker level. Every worker run is fresh, bounded by one dossier, and reduced back to one report before the supervisor decides the next step.
-```mermaid
-stateDiagram-v2
-  [*] --> Intake
-  Intake --> SourceGrounding: Complete intake
-  SourceGrounding --> WorkUnits: Sources ranked
-  WorkUnits --> AcceptanceMatrix: Units bounded
-  AcceptanceMatrix --> Dossier: Evidence rows ready
-  Dossier --> Delegation: DossierV1 valid
-  Delegation --> Verification: WorkerReportV1 returned
-  Verification --> Repair: FAIL or actionable BLOCKED
-  Repair --> Verification: Repair report returned
-  Verification --> Documentation: PASS with evidence
-  Documentation --> FinalDisposition: Outcome recorded
-  FinalDisposition --> [*]
-  Dossier --> Intake: Missing decision
-  Delegation --> Intake: Worker BLOCKED with human question
-```
+Strict mode means task size does not matter. Even if the request is "make a function that adds two numbers", explicit supervisor invocation still means the full workflow:
-## What It Is Used For
+1. Ask the complete intake packet.
+2. Build or record the source corpus.
+3. Create at least one work unit.
+4. Create acceptance rows.
+5. Create dossiers for the planned workers.
+6. Create a worker-agent plan.
+7. Ask for approval when the selected path is human-in-loop.
+8. Delegate scoped work to real workers when the environment supports it.
+9. Verify with evidence.
+10. Route repair work if verification fails.
+11. Refresh docs or outcome state.
+12. Report final status and next action.
-Use it for work that is:
+This rule exists to prevent the agent from deciding that a task is "too simple" and quietly skipping the supervisor.
-- broad or ambiguous
-- multi-step
-- high-risk
-- likely to need repair loops
-- likely to exceed one context window
-- important enough to require independent verification
-- easier to handle as several bounded units
+## Intake
-Good examples:
+The supervisor must get explicit answers to these seven items before planning deeply, creating a goal, delegating workers, implementing, publishing, or taking irreversible action:
-- migrate SQLite storage to LanceDB
-- refactor authentication across several modules
-- update docs from a new API spec
-- implement a feature with tests and verification
-- review and repair a messy PR
-- produce durable workflow docs for a long-running task
+```text
+1. Objective and source: what artifact, spec, repo path, document, ticket, or source set controls the work?
+2. Execution path: autonomous_goal or human_in_loop?
+3. Mode: sequential, parallel where safe, or staged parallel?
+4. Delegation: automated worker delegation, native threads/subagents if available, or same-session phased?
+5. Final disposition: keep local, open PR, push main, deploy/publish, or ask at the end?
+6. Boundaries: may I install dependencies, call external services, use credentials, or only edit local files?
+7. State artifacts: create .workflow docs, use another artifact directory, or keep state inline?
+```
-Do not use it for:
+If any answer is missing or vague, the supervisor asks only for the missing pieces and stops. Phrases like "work autonomously", "just do it", or "use your judgment" do not fill in the missing intake fields.
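The intake gate described above can be sketched as a small completeness check. This is a minimal illustration, not the package's implementation: the key names in `INTAKE_ITEMS` are hypothetical labels for the seven intake items, and treating an answer as vague when it contains one of the quoted phrases is an assumption.

```python
# Hypothetical keys for the seven intake items; the package does not define these names.
INTAKE_ITEMS = [
    "objective_and_source",
    "execution_path",
    "mode",
    "delegation",
    "final_disposition",
    "boundaries",
    "state_artifacts",
]

# Phrases the README says do not fill in missing intake fields.
VAGUE_PHRASES = ("work autonomously", "just do it", "use your judgment")


def missing_intake_items(answers: dict) -> list:
    """Return the intake items that are absent, empty, or answered vaguely."""
    missing = []
    for item in INTAKE_ITEMS:
        value = (answers.get(item) or "").strip().lower()
        if not value or any(phrase in value for phrase in VAGUE_PHRASES):
            missing.append(item)
    return missing
```

A supervisor following this sketch would ask only for the returned items and stop until the list is empty.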
-- tiny edits
-- one-off shell commands
-- obvious single-file changes
-- quick explanations
-- tasks where a normal agent turn is enough
+Expected human pauses are normal. A workflow can move from `WAITING_FOR_HUMAN` back to `ACTIVE` after the user approves a plan or answers a blocker question.
-## How It Works
+## The Workflow
-The lifecycle is:
+The full loop looks like this:
 ```text
-intake
--> source grounding
+complete intake
+-> source corpus
 -> work units
+-> loop policy
 -> acceptance matrix
--> DossierV1
--> worker delegation
--> WorkerReportV1
+-> dossiers
+-> approval or autonomous path gate
+-> worker handoff
+-> worker report
 -> verification
 -> repair if needed
 -> re-verification
@@ -214,257 +144,389 @@ intake
 -> final disposition
 ```
-### 1. Intake
-The supervisor must ask the user for every required decision before it plans deeply or starts work:
+The worker lifecycle is tracked separately:
 ```text
-1. Objective and source
-2. Execution path: autonomous_goal or human_in_loop
-3. Mode: sequential, parallel where safe, or staged parallel
-4. Delegation: automated workers, native subagents if available, or same-session phased
-5. Final disposition: keep local, open PR, push, deploy, publish, or ask at end
-6. Boundaries: installs, network, credentials, destructive operations, forbidden surfaces
-7. State artifacts: .workflow docs, another directory, or inline state
+planned -> handed_off -> acknowledged -> reported -> verified -> closed
 ```
-If any answer is missing or vague, the supervisor asks again and stops.
+This makes it possible to see where the workflow is, which worker owns which piece, what evidence exists, and what should happen next.
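As a rough sketch, the worker lifecycle above can be modelled as an ordered list of states. The assumption here is that transitions are strictly linear with no skipping or rollback; the README only lists the order, so the real workflow may allow more.

```python
# Lifecycle states in the order given above.
LIFECYCLE = ["planned", "handed_off", "acknowledged", "reported", "verified", "closed"]


def advance(state: str) -> str:
    """Return the next lifecycle state; raise if the state is unknown or terminal."""
    index = LIFECYCLE.index(state)  # raises ValueError for unknown states
    if index == len(LIFECYCLE) - 1:
        raise ValueError("worker is already closed")
    return LIFECYCLE[index + 1]
```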
-### 2. Source Grounding
+## Skills In The Pack
-The supervisor identifies the source of truth: files, specs, docs, tickets, user decisions, commands, or external constraints.
+The skill pack is made of small focused skills. The supervisor can use them as phase instructions.
-If source authority is unclear, the first work unit becomes discovery instead of implementation.
+| Skill | What it does |
+|---|---|
+| `workflow-supervisor` | Coordinates the whole workflow, gates, workers, verification, repair, and final disposition. |
+| `source-corpus` | Lists and ranks sources, gaps, contradictions, authority, freshness, and allowed next action. |
+| `work-unit` | Turns the objective into bounded units with dependencies, surfaces, readiness, and done criteria. |
+| `loop-policy` | Defines execution path, mode, approval gates, repair limits, budgets, goal policy, and resume behavior. |
+| `acceptance-matrix` | Turns requirements into evidence rows with PASS, FAIL, BLOCKED, and waiver handling. |
+| `dossier-builder` | Creates concrete `DossierV1` contracts for workers. |
+| `worker-roles` | Defines role boundaries so implementers, verifiers, repair authors, and documenters do not blur together. |
+| `workflow-docs` | Creates or refreshes durable `.workflow/` artifacts when state needs to persist. |
-### 3. Work Units
+Loading a skill does not spawn a worker. A skill is instruction context for the current supervisor. A worker is a separate role-scoped execution run.
-The supervisor splits the objective into bounded units.
+## Files The Workflow Creates
-For a SQLite to LanceDB migration, units might be:
+Workflow state lives under `.workflow/` by default. The directory is local supervisor memory, not product code.
+In a Git-backed project, `.workflow/` must be in `.gitignore` before these files are written. Project installs do this automatically.
+Common workflow files:
+| File | Created from | Purpose |
+|---|---|---|
+| `.workflow/WORKFLOW.md` | `workflow-supervisor`, `loop-policy`, `workflow-docs` | Main state, objective, execution path, policy, stop gates, next action. |
+| `.workflow/SOURCE-CORPUS.md` | `source-corpus`, `workflow-docs` | Source ranking, missing sources, contradictions, assumptions. |
+| `.workflow/WORK-UNITS.md` | `work-unit`, `workflow-docs` | Unit list, dependencies, sequencing, blocked units. |
+| `.workflow/DOSSIER.md` or `.workflow/dossiers/*.yaml` | `dossier-builder`, `workflow-docs` | Worker contracts for implementation, verification, repair, or documentation. |
+| `.workflow/WORKER-MAP.md` | `workflow-supervisor`, `worker-roles`, `workflow-docs` | Worker names, roles, transports, lifecycle, reports, blockers. |
+| `.workflow/ACCEPTANCE-MATRIX.md` | `acceptance-matrix`, `workflow-docs` | Evidence rows and material PASS, FAIL, BLOCKED states. |
+| `.workflow/VERIFICATION-REPORT.md` | verifier worker, `acceptance-matrix`, `workflow-docs` | Verification evidence, findings, skipped checks, residual risks. |
+| `.workflow/REPAIR-TICKETS.md` | repair worker, `workflow-docs` | Repair tasks tied to failed rows or verifier findings. |
+| `.workflow/DECISIONS.md` | supervisor, `workflow-docs` | User decisions, assumptions, reversals, unresolved questions. |
+| `.workflow/HANDOFF.md` | supervisor, `workflow-docs` | Resume pack for another agent or later session. |
+| `.workflow/OUTCOME.md` | supervisor, documenter worker, `workflow-docs` | Final status, checks, risks, disposition, next action. |
+| `.workflow/GOAL-STATE.md` | supervisor, `workflow-docs` | Codex goal mirror when goal state needs durable backup. |
+For documentation-heavy workflows, `workflow-docs` can also create:
 ```text
-U1 dependency and config
-U2 storage adapter
-U3 data migration path
-U4 tests and regression checks
-U5 docs and outcome report
+.workflow/DOCUMENTATION-BRIEF.md
+.workflow/CONTENT-INVENTORY.md
+.workflow/OUTLINE.md
+.workflow/CONTENT-DRAFT.md
+.workflow/CLAIMS-REGISTER.md
+.workflow/STYLE-GUIDE.md
+.workflow/GLOSSARY.md
+.workflow/ASSET-REGISTER.md
+.workflow/REVIEW-PLAN.md
+.workflow/REVISION-QUEUE.md
+.workflow/PUBLISHING-CHECKLIST.md
+.workflow/PUBLICATION-LOG.md
+.workflow/MAINTENANCE-PLAN.md
 ```
-### 4. DossierV1
+It should not create all of these by default. It should create the smallest useful set.
-Before any worker starts, the supervisor creates a concrete `DossierV1`.
+## Dossiers
-A dossier names:
+A dossier is the worker contract. It is how the supervisor prevents vague delegation.
-- the exact work unit
-- the worker role
+Before any worker starts, the supervisor creates a concrete `DossierV1` with:
+- workflow name
+- work unit
+- dossier id
+- worker name
+- worker role
+- delegation transport
+- start condition
+- objective and non-goals
+- source corpus and must-read sources
 - allowed surfaces
 - forbidden surfaces
-- must-read sources
 - acceptance rows
-- required evidence
 - adversarial checks
+- required commands or evidence
+- worker prompt
+- supervisor checkpoints
+- report schemas
 - stop gates
-- required report schema
+- assumptions
+- open questions
-Then the package validates it:
+Validate a dossier before delegation:
 ```bash
-workflow-supervisor validate-dossier .workflow/dossiers/U2-implementer.yaml --role implementer --unit U2 --json
+workflow-supervisor validate-dossier .workflow/dossiers/WU-001-implementer.yaml --role implementer --unit WU-001 --json
 ```
-If the dossier says `all files`, `TBD`, `unknown`, `as needed`, or leaves open questions unresolved, it fails. No worker starts.
+The validator rejects things like `TBD`, `unknown`, `all files`, `entire repo`, unresolved open questions, role mismatches, unit mismatches, missing forbidden surfaces, and prompts that do not require `WorkerReportV1`.
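The rejection rules above can be illustrated with a minimal sketch. The field names (`worker_role`, `work_unit`, `allowed_surfaces`, and so on) are hypothetical, since the actual `DossierV1` schema is not shown here, and the real validator may check more than this.

```python
# Placeholder values the validator is described as rejecting.
PLACEHOLDERS = ("tbd", "unknown", "all files", "entire repo")


def dossier_problems(dossier: dict, role: str, unit: str) -> list:
    """Return human-readable reasons a dossier should be rejected (empty if none)."""
    problems = []
    if dossier.get("worker_role") != role:
        problems.append("role mismatch")
    if dossier.get("work_unit") != unit:
        problems.append("unit mismatch")
    if not dossier.get("forbidden_surfaces"):
        problems.append("missing forbidden surfaces")
    if dossier.get("open_questions"):
        problems.append("unresolved open questions")
    if "WorkerReportV1" not in dossier.get("worker_prompt", ""):
        problems.append("prompt does not require WorkerReportV1")
    for surface in dossier.get("allowed_surfaces", []):
        if surface.strip().lower() in PLACEHOLDERS:
            problems.append(f"placeholder surface: {surface}")
    return problems
```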
-### 5. Worker Delegation
+## Workers
-The supervisor launches one role-scoped worker through Codex or Claude Code.
+The required worker responsibilities are:
-```bash
-workflow-supervisor delegate --agent codex --role implementer --unit U2 --dossier .workflow/dossiers/U2-implementer.yaml
-```
+| Responsibility | CLI role value | What it does |
+|---|---|---|
+| Implementer | `implementer` | Changes only the allowed surfaces named in the dossier. |
+| Verifier | `verifier` | Checks the work against acceptance rows and must not edit implementation. |
+| Repair author | `repair` | Converts failed rows or verifier findings into actionable repair work. |
+| Documenter | `documenter` | Updates workflow or outcome docs from evidence. |
-The worker does not get the whole chat. It receives the dossier and report schema.
+The skill text may say "repair-author" because that is the human role. The CLI schema uses `repair`.
-### 6. Verification And Repair
+Workers receive only their scoped handoff:
-A verifier checks the implementer's work against acceptance rows.
+- role
+- dossier
+- sources
+- acceptance rows
+- stop gates
+- report schema
+They return one terminal `WorkerReportV1`.
+## Worker Reports
+Every delegated worker returns this machine-shaped report:
+```json
+{
+  "schema": "WorkerReportV1",
+  "status": "PASS",
+  "role": "verifier",
+  "unit_id": "WU-001",
+  "summary": "Verified the API responses and retrieval behavior against the acceptance rows.",
+  "changed_surfaces": [],
+  "evidence": ["pytest tests/test_api.py passed", "manual inspection of /health response"],
+  "checks_run": ["pytest tests/test_api.py"],
+  "skipped_checks": [],
+  "findings": [],
+  "blocking_question": null,
+  "next_action": "supervisor_review",
+  "adapter": null,
+  "guard": null,
+  "reason": null
+}
+```
-If verification fails, the supervisor creates repair work. Repairs must point back to verifier findings or acceptance rows. After repair, verification runs again.
+The supervisor trusts the report shape, not loose prose. A PASS without evidence is invalid. A verifier that edits implementation is invalid. A worker that asks the human directly is converted into a blocker for the supervisor to route.
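A minimal sketch of two of the report rules above, using the field names from the example report. Treating a non-empty `changed_surfaces` list as evidence that a verifier edited implementation is an assumption made for illustration; the package's own guards may work differently.

```python
def report_problems(report: dict) -> list:
    """Return reasons a worker report should be treated as invalid (empty if none)."""
    problems = []
    if report.get("schema") != "WorkerReportV1":
        problems.append("unexpected schema")
    # A PASS must carry at least one piece of evidence.
    if report.get("status") == "PASS" and not report.get("evidence"):
        problems.append("PASS without evidence")
    # Assumption: any changed surface reported by a verifier counts as an edit.
    if report.get("role") == "verifier" and report.get("changed_surfaces"):
        problems.append("verifier changed surfaces")
    return problems
```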
-### 7. Final Disposition
+## How The Supervisor Talks To Workers
-The supervisor applies the final disposition chosen during intake:
+The portable worker path is one CLI command:
-- keep changes local
-- open a PR
-- push
-- deploy
-- publish
-- ask at the end
+```bash
+workflow-supervisor delegate \
+  --agent <codex|claude-code> \
+  --role <implementer|verifier|repair|documenter> \
+  --unit <unit-id> \
+  --cwd <workspace> \
+  --dossier <path>
+```
-No final irreversible action is inferred from vibes.
+The command:
-## What The User Sees
+1. Validates the dossier as `DossierV1`.
+2. Builds a scoped worker prompt.
+3. Starts the selected agent CLI with an adapter command array.
+4. Captures stdout, stderr, exit code, and timeout.
+5. Extracts and validates `WorkerReportV1`.
+6. Runs surface and role guards.
+7. Prints one normalized JSON report for the supervisor.
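For a supervisor script that shells out to this command, the argument list could be assembled as in the sketch below. The helper name `delegate_argv` is hypothetical; only the flags and the agent and role values shown above are used.

```python
# Agent and role values as listed in the command synopsis above.
AGENTS = {"codex", "claude-code"}
ROLES = {"implementer", "verifier", "repair", "documenter"}


def delegate_argv(agent: str, role: str, unit: str, cwd: str, dossier: str) -> list:
    """Build the argument list for one `workflow-supervisor delegate` call."""
    if agent not in AGENTS:
        raise ValueError(f"unsupported agent: {agent}")
    if role not in ROLES:
        raise ValueError(f"unsupported role: {role}")
    return [
        "workflow-supervisor", "delegate",
        "--agent", agent,
        "--role", role,
        "--unit", unit,
        "--cwd", cwd,
        "--dossier", dossier,
    ]
```

The resulting list can be passed to `subprocess.run(argv, capture_output=True, text=True)`. Parsing stdout as JSON afterwards assumes the command prints only the normalized report.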
-The user sees the supervisor, not worker chatter.
+Certified worker adapters:
-In `human_in_loop`, the user sees:
+- `codex`
+- `claude-code`
-```text
-intake question
-approval packet
-progress summaries
-blocker questions if needed
-final report
+The `generic` target is for Markdown instruction export. It is not a certified automated worker adapter.
+Check local adapter readiness:
+```bash
+workflow-supervisor delegate-doctor --agent all --probe --require-pass
 ```
-In `autonomous_goal`, the user sees:
+If a worker adapter is missing, unauthenticated, times out, returns invalid output, edits forbidden surfaces, or returns PASS without evidence, the delegate command returns a structured `BLOCKED` report.
+## No Silent Fallbacks
+If the environment can create, message, or delegate to worker agents, the supervisor must use real workers for implementation, verification, repair, and documentation responsibilities.
+If it cannot, it must record:
 ```text
-intake question
-execution plan
-periodic progress summaries
-blockers only when needed
-final report
+worker_agent_unavailable
 ```
-Workers do not ask the user questions directly. They return `BLOCKED` and the supervisor decides how to route it.
+Then it must stop for a human decision unless complete intake explicitly selected `same_session_phased`.
+Same-session phased work is allowed only when selected. Verification in that mode is a `self-check`, not an `independent-verifier`.
-## What The Output Is
+## Install
-Workflow Supervisor produces three kinds of output.
+Install from npm once published:
-### 1. Durable Workflow State
+```bash
+npm install -g workflow-supervisor
+workflow-supervisor validate
+```
-Usually under `.workflow/`:
+Use with `npx`:
-```text
-WORKFLOW.md
-SOURCE-CORPUS.md
-WORK-UNITS.md
-DOSSIER.md
-WORKER-MAP.md
-ACCEPTANCE-MATRIX.md
-VERIFICATION-REPORT.md
-REPAIR-TICKETS.md
-DECISIONS.md
-HANDOFF.md
-OUTCOME.md
-GOAL-STATE.md
+```bash
+npx workflow-supervisor list
 ```
-These files make the workflow resumable after context compaction or handoff.
+Install skills for Codex:
-### 2. Worker Reports
+```bash
+npx workflow-supervisor install --agent codex --scope user
+```
-Every worker returns `WorkerReportV1`:
+Install skills for Claude Code:
-```json
-{
-  "schema": "WorkerReportV1",
-  "status": "PASS",
-  "role": "verifier",
-  "unit_id": "U2",
-  "summary": "Verified LanceDB-backed search path.",
-  "changed_surfaces": [],
-  "evidence": ["pytest tests/test_search.py passed"],
-  "checks_run": ["pytest tests/test_search.py"],
-  "skipped_checks": [],
-  "findings": [],
-  "blocking_question": null,
-  "next_action": "supervisor_review",
-  "adapter": null,
-  "guard": null,
-  "reason": null
-}
+```bash
+npx workflow-supervisor install --agent claude-code --scope user
 ```
-### 3. Final Supervisor Report
+Install both certified targets into a project:
-The final report names:
+```bash
+npx workflow-supervisor install --agent all --scope project --project .
+```
-- execution path
-- goal status
-- sources used
-- work units completed
-- workers delegated
-- checks run
-- skipped checks
-- repairs performed
-- residual risks
-- final disposition
-- next action
+Project installs copy the skill folders into project-level agent directories and ensure the target project `.gitignore` contains:
-## How To Install
+```gitignore
+.workflow/
+```
 From a local checkout:
 ```bash
 git clone https://github.com/NikolaCehic/workflow-supervisor.git
 cd workflow-supervisor
+npm install
 npm run validate
 ```
-Install for Codex:
+## Basic Use
-```bash
-npx workflow-supervisor install --agent codex --scope user
+After installing the skills, ask your agent:
+```text
+Use $workflow-supervisor to implement a healthcare specialist FastAPI Naive RAG demo.
 ```
-Install for Claude Code:
+You should expect:
+1. The supervisor asks the complete intake packet.
+2. You answer every intake item.
+3. If the path is `human_in_loop`, the supervisor gives you an approval packet before implementation.
+4. The supervisor creates work units, acceptance rows, and dossiers.
+5. The supervisor delegates scoped work to workers when supported.
+6. Workers return structured reports.
+7. The supervisor verifies, routes repairs if needed, and gives you the final result.
+If you only want a normal quick edit, do not invoke `$workflow-supervisor`.
+## CLI Reference
+Common commands:
 ```bash
-npx workflow-supervisor install --agent claude-code --scope user
+workflow-supervisor list
+workflow-supervisor validate
+workflow-supervisor doctor --agent all
+workflow-supervisor install --agent codex --scope user
+workflow-supervisor install --agent claude-code --scope user
+workflow-supervisor install --agent all --scope project --project .
+workflow-supervisor emit-context --agent generic --out AGENTS.md
+workflow-supervisor validate-dossier .workflow/dossiers/WU-001-implementer.yaml --role implementer --unit WU-001 --json
+workflow-supervisor delegate --agent codex --role implementer --unit WU-001 --cwd . --dossier .workflow/dossiers/WU-001-implementer.yaml
+workflow-supervisor delegate --agent claude-code --role verifier --unit WU-001 --cwd . --dossier .workflow/dossiers/WU-001-verifier.yaml
+workflow-supervisor delegate-doctor --agent all --probe --require-pass
 ```
-Install for both in a project:
+The package exposes two binary names:
-```bash
-npx workflow-supervisor install --agent all --scope project --project .
+```text
+workflow-supervisor
+workflow-skills
 ```
-Project installs also add `.workflow/` to the target project's `.gitignore`. Workflow state is local working memory by default; it should not be pushed with the consuming codebase unless the user explicitly chooses that.
+`workflow-skills` is kept as an alias. Prefer `workflow-supervisor` in user-facing instructions.
+## Codex, Claude Code, And Generic Targets
-Export generic Markdown instructions:
+Codex support uses:
+- `SKILL.md`
+- `agents/openai.yaml`
+- the `codex` CLI adapter for delegated workers
+Claude Code support uses:
+- the same `SKILL.md` folders
+- the `claude` CLI adapter for delegated workers
+- optional emitted context through `CLAUDE.md`
+The presence of `agents/openai.yaml` does not mean Claude Code is unsupported. It only means Codex has a specific metadata format.
+Generic support is for custom Markdown-reading agent setups:
 ```bash
-npx workflow-supervisor emit-context --agent generic --out AGENTS.md
+npx workflow-supervisor emit-context --agent generic --skills workflow-supervisor,workflow-docs --out AGENTS.md
 ```
-## How To Use
+Generic is not a certified worker delegation target.
-In Codex or Claude Code, ask explicitly:
+## Package Layout
 ```text
-Use $workflow-supervisor to migrate this repo from SQLite to LanceDB.
+skills/                  Skill instructions
+skills/*/agents/          Agent metadata, including Codex openai.yaml files
+schemas/                 DossierV1 and WorkerReportV1 schemas
+adapters/                Codex and Claude Code delegate command arrays
+docs/                    CLI, artifact, compatibility, and troubleshooting docs
+assets/                  README image assets
+bin/workflow-skills.mjs  Installer, validator, delegation wrapper, and command dispatch
 ```
-The correct next step is not implementation. The correct next step is the intake packet.
+The npm package includes:
+```text
+skills
+adapters
+schemas
+docs
+assets
+bin
+README.md
+LICENSE
+```
-After intake is complete, the supervisor proceeds through work units, dossiers, workers, verification, repair, and final disposition.
+## Publishing Checklist
-Useful CLI commands:
+Before publishing:
 ```bash
-workflow-supervisor validate
-workflow-supervisor validate-dossier .workflow/dossiers/U2-implementer.yaml --role implementer --unit U2 --json
-workflow-supervisor delegate --agent codex --role implementer --unit U2 --dossier .workflow/dossiers/U2-implementer.yaml
-workflow-supervisor delegate --agent claude-code --role verifier --unit U2 --dossier .workflow/dossiers/U2-verifier.yaml
-workflow-supervisor delegate-doctor --agent all --probe --require-pass
+npm run validate
+npm pack --dry-run
 ```
-## For Agents
+`npm run validate` checks skill structure, adapter metadata, schema artifacts, and the test suite.
+`npm pack --dry-run` shows exactly what will be included in the npm package.
+The package also has:
+- `prepublishOnly`: runs `npm run validate`
+- `engines.node`: `>=18`
+- `license`: `MIT`
+- `bin.workflow-supervisor`: `bin/workflow-skills`
+- `bin.workflow-skills`: `bin/workflow-skills`
+## Rules For Agents Reading This README
-If you are an AI agent reading this README, follow these rules:
+If you are an agent using this package:
 1. Do not start work before complete intake.
-2. Do not infer missing permissions from user intent.
-3. Do not delegate without a valid `DossierV1`.
-4. Do not accept worker prose as completion.
-5. Trust only `WorkerReportV1` reports.
-6. Treat verifier edits as invalid.
-7. Ask the user only through the supervisor.
-8. Before creating `.workflow/` in a codebase, ensure `.gitignore` contains `.workflow/`.
-9. Preserve enough `.workflow/` state for another agent to resume, but do not stage or publish it unless explicitly requested.
-The point is not to add ceremony. The point is to make complex agent work hard to derail.
+2. Do not infer missing permissions from words like "autonomous", "generate", or "work until done".
+3. If `$workflow-supervisor` is explicit, always create at least one work unit.
+4. Do not delegate without a valid `DossierV1`.
+5. Use separate worker agents when supported by the environment.
+6. Do not silently collapse worker agents into same-session roleplay.
+7. Treat same-session verification as `self-check`, not `independent-verifier`.
+8. Trust only structured `WorkerReportV1` results from delegated workers.
+9. Treat verifier edits as invalid.
+10. Keep `.workflow/` ignored and local unless the user explicitly asks to publish it.
+The promise is not magic autonomy. The promise is disciplined supervision: clear setup, bounded work, scoped workers, structured reports, evidence, repair, and a clean final handoff.

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "workflow-supervisor",
-  "version": "0.1.1",
+  "version": "0.1.2",
   "description": "Portable workflow supervision skills for Codex, Claude Code, and generic agent workspaces.",
   "type": "module",
   "repository": {

package/skills/workflow-supervisor/SKILL.md CHANGED

@@ -1,11 +1,34 @@
 ---
 name: workflow-supervisor
-description: Coordinate open-ended, multi-step agent workflows when the user explicitly requests supervised or agent-loop coordination and at least one hard trigger is present, or when no explicit supervisor wording exists but two or more hard triggers are present. Hard triggers include multi-agent or worker delegation, durable resume need, high-risk independent verification, contradictory or missing sources, multi-unit scope, repair loops, approval gates, or workflow-state documentation. Do not use for simple single-turn answers, ordinary repo inspection, medium scoped edits, typo fixes, one-off tests, or narrowly scoped changes that can be completed directly.
+description: Coordinate supervised multi-agent workflows. Trigger whenever the user explicitly invokes workflow-supervisor, $workflow-supervisor, supervised workflow, dossiers, work units, worker agents, handoffs, approval gates, durable resume, or workflow-state documentation. When explicitly invoked, always run the full strict supervisor workflow regardless of task size; do not downscope, skip human approval or complete intake, skip dossiers, skip work units, skip worker-agent contracts, skip worker handoffs, or skip verification because a task appears simple. When not explicitly invoked, use only for workflows with hard supervisor triggers such as multi-agent handoff, durable resume, high-risk verification, contradictory or missing sources, multi-unit scope, repair loops, approval gates, or workflow-state documentation.
 ---
 # Workflow Supervisor
-Use this skill as the coordinating spine for complex work. The supervisor owns decomposition, delegation quality, loop discipline, stop gates, and outcome reporting. It may do source discovery and reporting itself, but implementation, verification, repair-ticket writing, and documentation should be treated as separate roles when an automated worker path is available. Native threads or subagents are optional transport optimizations, not the workflow contract.
+Use this skill as the coordinating spine for supervised multi-agent work. The supervisor owns decomposition, worker-agent handoff quality, loop discipline, stop gates, and outcome reporting. It may do source discovery and reporting itself, but implementation, verification, repair-ticket writing, and documentation must be treated as separate worker-agent responsibilities when an automated worker path is available. Native threads, subagents, or the portable delegate command are transports for those worker agents.
+## Strict Explicit Invocation Contract
+When the user explicitly invokes `workflow-supervisor`, `$workflow-supervisor`, or says to use this skill, the workflow is in `strict_full_workflow` mode. Task size is irrelevant. Do not decide that a request is too small, too easy, local-only, or single-file to receive the full workflow.
+Strict mode always requires:
+1. Complete intake before planning, goal creation, worker delegation, implementation, publication, or irreversible action.
+2. A human approval question before implementation unless completed intake explicitly selects `autonomous_goal`.
+3. A source corpus map, even if the source corpus is only "user prompt plus current workspace".
+4. At least one bounded work unit, even for a tiny change. Use `WU-001` when there is only one unit.
+5. A dossier for each implementation work unit before implementation begins.
+6. An acceptance matrix or acceptance draft with evidence expectations before implementation begins.
+7. A worker-agent plan with implementer, verifier, repair-author, and documenter agents.
+8. A worker lifecycle record using `planned -> handed_off -> acknowledged -> reported -> verified -> closed`.
+9. Verification labeled as `self-check`, `focused-check`, or `independent-verifier`.
+10. A final disposition question or recorded completed-intake final disposition after verification.
+Worker agents are mandatory when the environment provides worker, subagent, thread, or portable delegation tools. The supervisor must hand off implementation, verification, repair-authoring when needed, and documentation to separate agents with scoped dossiers and the required report schema. Run worker agents sequentially by default unless completed intake explicitly authorizes parallelism.
+If the environment cannot create, message, or delegate to worker agents, record `worker_agent_unavailable` and stop for the human decision unless completed intake explicitly selected `same_session_phased`. Do not silently collapse worker agents into same-session work.
+Do not nest supervisors recursively. A worker agent that receives a supervisor-scoped dossier must perform its assigned role instead of spawning another supervisor layer unless the parent supervisor explicitly asks for a child supervisor.
 ## Domain Neutrality
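
Reviewer sketch for the hunk above: strict mode item 8 asks for a worker lifecycle record that moves through `planned -> handed_off -> acknowledged -> reported -> verified -> closed`. A minimal way to model that record is shown below in Python. The `WorkerLifecycle` class, the `advance` method, and the example worker id are hypothetical and not part of the package; the rule that each transition may only move one step forward is an assumption, since the skill text only names the order of states.

```python
from dataclasses import dataclass, field
from typing import List

# State order quoted from strict mode item 8 in the hunk above.
LIFECYCLE = ["planned", "handed_off", "acknowledged", "reported", "verified", "closed"]


@dataclass
class WorkerLifecycle:
    """Hypothetical in-memory lifecycle record (not the package's format)."""

    worker_id: str
    state: str = "planned"
    history: List[str] = field(default_factory=lambda: ["planned"])

    def advance(self, new_state: str) -> None:
        # Assumption: only the immediate next state is legal, so no step is skipped.
        if new_state not in LIFECYCLE:
            raise ValueError(f"unknown state: {new_state}")
        if LIFECYCLE.index(new_state) != LIFECYCLE.index(self.state) + 1:
            raise ValueError(f"illegal transition: {self.state} -> {new_state}")
        self.state = new_state
        self.history.append(new_state)


# Example worker id is made up for illustration.
record = WorkerLifecycle(worker_id="WU-001-implementer")
record.advance("handed_off")
record.advance("acknowledged")
```

Keeping the full `history` list rather than only the current state is one way to give another agent enough detail to resume, which the skill asks for elsewhere.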
@@ -25,7 +48,7 @@ Use this lifecycle:
 4. If an active relevant goal exists, reuse it.
 5. If an active unrelated goal exists, do not create, reuse, complete, block, or update it. Ask the user whether to switch goals or continue with goal binding skipped.
 6. If no active goal exists and completed intake authorizes goal binding, call `create_goal` at most once with a concrete objective.
-7. Do not create a goal for simple single-turn answers, ordinary scoped edits, tiny tasks, incomplete intake, or when the user says not to.
+7. Do not create a goal for incomplete intake or when the user says not to.
 8. Keep the goal objective stable. Track tactical steps in the plan, dossier, workflow docs, or `.workflow/GOAL-STATE.md` rather than trying to rewrite the goal.
 9. Use `update_goal` only for terminal `complete` or `blocked` states when the environment supports that action.
 10. Mark the goal complete only after acceptance evidence supports completion and no required supervisor work remains.
@@ -41,12 +64,13 @@ If the environment has no goal tool or goal creation is not permitted, state the
 - Run the complete intake gate before goal creation, worker delegation, implementation, publication, or other irreversible action.
 - Do not infer execution path, mode, delegation, final disposition, or boundaries from keywords, action verbs, or intent guesses.
 - Classify the workflow as `autonomous_goal` or `human_in_loop` only from completed intake answers before delegating workers or beginning implementation.
+- Explicit invocation always requires complete intake, work units, dossiers, worker-agent contracts, scoped handoffs, report schema, and verification; do this even for trivial tasks.
 - Always produce a plan after complete intake. In `human_in_loop`, make it an approval packet and stop for approval. In `autonomous_goal`, make it an execution plan and continue only when the completed intake authorizes that path.
-- Do not begin implementation until complete intake and the path gate are satisfied, at least one concrete dossier exists, and no stop gate applies.
-- Delegate workers only through an automated supported delegation transport after complete intake and the path gate authorize delegation. If no supported transport exists, use same-session phased mode only when intake allowed it; otherwise stop as `delegation_unavailable`.
+- Do not begin implementation until complete intake and the path gate are satisfied, at least one work unit exists, at least one concrete dossier exists, worker-agent contracts exist, and no stop gate applies.
+- Delegate workers only through an automated supported delegation transport after complete intake and the path gate authorize delegation. If no supported transport exists, use same-session phased mode only when intake allowed it; otherwise stop as `worker_agent_unavailable`.
 - Do not start implementer, verifier, repair-author, or documenter workers before complete intake and the path gate are satisfied; role-specific start conditions are additional gates after that.
 - Keep roles separate: implementers implement, verifiers verify, repair authors write tickets, documenters update workflow artifacts, and the supervisor coordinates.
-- Treat same-session verification as a self-check, not independent verification.
+- Treat same-session verification as a self-check, not independent verification. Separate verifier-agent verification may be labeled `independent-verifier` only when genuinely performed by a separate worker agent or thread.
 - Prefer explicit PASS/FAIL/BLOCKED states over soft completion language.
 - Stop instead of improvising when sources are missing, contradictory, materially stale, or too vague to produce acceptance criteria.
 - Keep provenance optional; require enough outcome detail for another agent to resume.
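
Reviewer sketch for the hunk above: the changed bullets describe two decisions, which path to take when no delegation transport exists and how to label verification. The Python below is one possible reading, assuming it is only called after complete intake and the path gate are satisfied. The function and parameter names are hypothetical. The `focused-check` label from the strict-mode list is left out because the diff does not define when it applies.

```python
from typing import Optional


def choose_execution_path(transport_available: bool, intake_mode: Optional[str]) -> str:
    """Pick the execution path after intake and the path gate are satisfied."""
    if transport_available:
        # An automated supported delegation transport exists: delegate.
        return "delegate"
    if intake_mode == "same_session_phased":
        # Allowed only because completed intake explicitly selected it.
        return "same_session_phased"
    # Otherwise stop for the human decision instead of collapsing roles silently.
    return "worker_agent_unavailable"


def verification_label(separate_verifier_agent: bool) -> str:
    """Same-session verification is a self-check, never independent."""
    return "independent-verifier" if separate_verifier_agent else "self-check"
```

For example, `choose_execution_path(False, None)` yields `worker_agent_unavailable`, matching the renamed stop state in this release.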
@@ -64,7 +88,29 @@ Treat these as distinct mechanisms:
 - Native thread or subagent: an environment-specific transport a worker adapter may use when it is available and authorized.
 - Same-session phased mode: the current agent performs roles sequentially. Verification in this mode is a self-check, not independent verification.
-Start workers only after complete intake and the path gate are satisfied, a concrete dossier exists, the loop policy authorizes delegation, and the environment exposes an automated supported transport. If environment rules require explicit user approval for user-visible native thread creation, obtain it before using that transport. Do not use manual copy/paste handoff as the primary path. If automated delegation is unavailable, mark the unit `delegation_unavailable` unless completed intake explicitly selected same-session phased work.
+Start workers only after complete intake and the path gate are satisfied, at least one work unit exists, a concrete dossier exists, the loop policy authorizes delegation, and the environment exposes an automated supported transport. If environment rules require explicit user approval for user-visible native thread creation, obtain it before using that transport. Do not use manual copy/paste handoff as the primary path. If automated delegation is unavailable, mark the unit `worker_agent_unavailable` unless completed intake explicitly selected same-session phased work.
+## Worker Report Schema
+Every worker report back to the supervisor must use this schema:
+```text
+status: PASS | FAIL | BLOCKED | PARTIAL
+worker_id:
+role: implementer | verifier | repair-author | documenter
+work_unit_id:
+dossier_id:
+summary:
+changed_files:
+acceptance_evidence:
+checks_run:
+skipped_checks:
+blockers:
+residual_risks:
+next_recommended_action:
+```
+Implementers may edit only allowed surfaces from the dossier. Verifiers must not edit. Repair authors write repair tickets from failed acceptance rows and must not expand scope. Documenters update only approved workflow or documentation surfaces after source, implementation, verification, or repair evidence exists.
 ## Intake Gate
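
Reviewer sketch for the hunk above: the new Worker Report Schema section lists thirteen fields, four allowed statuses, and four roles, and states that verifiers must not edit. A supervisor-side check could look roughly like the Python below. This is an illustration only, not the package's `WorkerReportV1` schema under `schemas/`; the `validate_worker_report` function, the dict representation of a report, and the message strings are assumptions.

```python
from typing import Dict, List

# Field names, statuses, and roles copied from the schema block in the hunk above.
REQUIRED_FIELDS = [
    "status", "worker_id", "role", "work_unit_id", "dossier_id", "summary",
    "changed_files", "acceptance_evidence", "checks_run", "skipped_checks",
    "blockers", "residual_risks", "next_recommended_action",
]
STATUSES = {"PASS", "FAIL", "BLOCKED", "PARTIAL"}
ROLES = {"implementer", "verifier", "repair-author", "documenter"}


def validate_worker_report(report: Dict[str, object]) -> List[str]:
    """Return a list of problems; an empty list means the report looks well formed."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in report]
    if "status" in report and report["status"] not in STATUSES:
        problems.append(f"invalid status: {report['status']!r}")
    if "role" in report and report["role"] not in ROLES:
        problems.append(f"invalid role: {report['role']!r}")
    # "Verifiers must not edit": a verifier report listing changed files is suspect.
    if report.get("role") == "verifier" and report.get("changed_files"):
        problems.append("verifier reported changed files")
    return problems
```

Returning a list of problems instead of raising on the first one lets the supervisor record every defect in a single repair ticket.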
@@ -109,7 +155,7 @@ Negative example: "Using Workflow Supervisor, generate an API and create the pro
 2. Restate the objective, constraints, non-goals, known sources, and unknowns from the completed intake.
 3. Bind or reconcile the Codex goal only after complete intake and only when no unrelated active goal prevents binding.
 4. Build or request a source corpus map. Use `$source-corpus` when source authority, freshness, or contradictions matter.
-5. Split the objective into bounded work units. Use `$work-unit` for ambiguous or multi-phase goals.
+5. Split the objective into bounded work units. Use `$work-unit` for ambiguous or multi-phase goals. If the task is tiny, create exactly one work unit named `WU-001`.
 6. Choose a loop policy before starting work: sequential or parallel, retry limits, approval gates, budgets, goal update cadence, and blocker rules. Use `$loop-policy` when the policy is not obvious.
 7. Build dossiers for the first implementation units and any planned verification, repair, or documentation workers. Use `$dossier-builder` when delegating work to another agent or when the task has boundaries.
 8. Assign worker roles with explicit allowed and forbidden behavior. Use `$worker-roles` for multi-agent, native-thread, or portable-worker work.
@@ -184,7 +230,7 @@ workflow-supervisor validate-dossier <path> --role <role> --unit <unit-id> --jso
 If the dossier does not pass `DossierV1` validation, do not start the worker. Create a discovery dossier, ask for the missing decision, or mark the unit BLOCKED.
-Adapters may use native threads, native subagents, or one-shot CLI execution underneath, but the supervisor consumes only the normalized worker report. Use `workflow-supervisor delegate-doctor --agent <agent> --probe` to test the installed local adapter before relying on it for a workflow. If automated delegation is unavailable, mark execution as `delegation_unavailable` unless completed intake selected `same_session_phased`.
+Adapters may use native threads, native subagents, or one-shot CLI execution underneath, but the supervisor consumes only the normalized worker report. Use `workflow-supervisor delegate-doctor --agent <agent> --probe` to test the installed local adapter before relying on it for a workflow. If automated delegation is unavailable, mark execution as `worker_agent_unavailable` unless completed intake selected `same_session_phased`.
 Name workers deterministically from the workflow, unit, role, and dossier:
@@ -247,6 +293,7 @@ Stop when:
 - source authority cannot be established
 - sources contradict each other on a material requirement
 - the requested scope cannot fit into a bounded work unit
+- mandatory approval packet, work unit, dossier, worker-agent contract, or acceptance matrix is missing
 - allowed and forbidden surfaces cannot be named
 - acceptance cannot be verified with evidence
 - a verifier is asked to edit or an implementer is asked to self-approve
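
Reviewer sketch for the hunk above: the added stop condition fires when any of the mandatory approval packet, work unit, dossier, worker-agent contract, or acceptance matrix is missing. A minimal Python check is shown below. The artifact keys and the function name are hypothetical; the package may track these artifacts differently under `.workflow/`.

```python
from typing import Iterable, List

# Hypothetical keys for the five artifacts named in the added stop condition.
MANDATORY_ARTIFACTS = [
    "approval_packet",
    "work_unit",
    "dossier",
    "worker_agent_contract",
    "acceptance_matrix",
]


def missing_mandatory_artifacts(present: Iterable[str]) -> List[str]:
    """List mandatory artifacts that are absent; a non-empty result means stop."""
    present_set = set(present)
    return [name for name in MANDATORY_ARTIFACTS if name not in present_set]


missing = missing_mandatory_artifacts(["work_unit", "dossier"])
```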
@@ -267,7 +314,10 @@ Report:
 - Objective handled
 - Sources used and gaps
 - Work units completed or remaining
+- Approval question id and whether `WAITING_FOR_HUMAN -> ACTIVE` occurred
+- Dossiers created or missing
 - Workers delegated, blocked, unavailable, or skipped
+- Worker lifecycle status for each role
 - Verification evidence
 - Repairs performed or recommended
 - Checks run and skipped