@kbediako/codex-orchestrator 0.1.8 → 0.1.9

package/README.md CHANGED
@@ -1,264 +1,94 @@
  # Codex Orchestrator
 
- Codex Orchestrator is the coordination layer that glues together Codex-driven agents, run pipelines, approval policies, and evidence capture for multi-stage automation projects. It wraps a reusable orchestration core with a CLI that produces auditable manifests, integrates with control-plane validators, and syncs run results to downstream systems.
+ Codex Orchestrator is the CLI + runtime that coordinates Codex-driven runs, pipelines, and delegation MCP tooling. The npm release focuses on running pipelines locally, emitting auditable manifests, and hosting the delegation server.
 
- > **At a glance:** Every run starts from a task description, writes the active CLI manifest to `.runs/<task-id>/cli/<run-id>/manifest.json`, emits a persisted run summary at `.runs/<task-id>/<run-id>/manifest.json`, mirrors human-readable data to `out/<task-id>/`, and can optionally sync to a remote control plane. Pipelines define the concrete commands (build, lint, test, etc.) that execute for a given task.
+ ## Install
 
- ## How It Works
8
- - **Planner → Builder → Tester → Reviewer:** The core `TaskManager` (see `orchestrator/src/manager.ts`) wires together agent interfaces that decide *what* to run (planner), execute the selected pipeline stage (builder), verify results (tester), and give a final decision (reviewer).
9
- - **Execution modes:** Each plan item can flag `requires_cloud` and task metadata can set `execution.parallel`; the mode policy picks `mcp` (local MCP runtime) or `cloud` execution accordingly.
10
- - **Event-driven persistence:** Milestones emit typed events on `EventBus`. `PersistenceCoordinator` captures run summaries in the task state store and writes manifests so nothing is lost if the process crashes.
11
- - **CLI lifecycle:** `CodexOrchestrator` (in `orchestrator/src/cli/orchestrator.ts`) resolves instruction sources (`AGENTS.md`, `docs/AGENTS.md`, `.agent/AGENTS.md`), loads the chosen pipeline, executes each command stage via `runCommandStage`, and keeps heartbeats plus command status current inside the manifest (approval evidence will surface once prompt wiring lands).
12
- - **Control-plane & scheduler integrations:** Optional validation (`control-plane/`) and scheduling (`scheduler/`) modules enrich manifests with drift checks, plan assignments, and remote run metadata.
13
- - **Cloud sync (optional):** `orchestrator/src/sync/` includes a `CloudSyncWorker` + `CloudRunsClient`, but the default CLI does not wire cloud uploads yet—treat this as an integration point you enable explicitly.
14
- - **Tool orchestration:** The shared `packages/orchestrator` toolkit handles approval prompts, sandbox retries, and tool run bookkeeping used by higher-level agents.
15
-
16
- ```
17
- Task input ─► Planner ─► Mode policy (mcp/cloud) ─► Builder ─► Tester ─► Reviewer ─► Run summary
18
- │ │ │ │ │
19
- │ │ │ │ └─► Control-plane validators / Scheduler hooks / Cloud sync
20
- │ │ │ │
21
- └─► EventBus ─► PersistenceCoordinator ─► .runs/ manifests ─► out/ audits
22
-
23
- └─► Task state snapshots & guardrail evidence
24
-
25
- Group execution (when `FEATURE_TFGRPO_GROUP=on`): repeat the Builder → Tester → Reviewer stages for prioritized subtasks until a stage fails or the list completes.
26
- ```
27
-
28
- - **Mode policy:** Defaults to `mcp` but upgrades to `cloud` whenever a subtask flags `requires_cloud` or task metadata enables parallel execution, ensuring builders/testers run in the correct environment before artifacts are produced.
29
- - **Event-driven persistence:** Every `run:completed` event flows through `PersistenceCoordinator`, writing manifests under `.runs/<task-id>/<run-id>/` and keeping task-state snapshots current before downstream consumers (control-plane validators, scheduler hooks, optional cloud sync) ingest the data.
30
- - **Optional group loop:** When the TF-GRPO feature flag is on, the manager processes the prioritized subtask list serially, stopping early if any Builder or Tester stage fails so reviewers only see runnable work with passing prerequisites.
31
-
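The mode policy described in the removed "How It Works" section can be read as a small decision function. The sketch below is illustrative only: `selectExecutionMode` and the shape of its arguments are assumptions made for this example, not the package's `TaskManager` API. Only the `requires_cloud` flag, the `execution.parallel` metadata key, and the `mcp`/`cloud` mode names come from the README text.

```javascript
// Hypothetical helper: the name and argument shapes are assumed for illustration.
function selectExecutionMode(subtask, taskMetadata) {
  // A subtask can demand cloud execution explicitly.
  const needsCloud = subtask?.requires_cloud === true;
  // Task metadata can enable parallel execution, which also upgrades to cloud.
  const parallel = taskMetadata?.execution?.parallel === true;
  // Everything else stays on the local MCP runtime.
  return needsCloud || parallel ? 'cloud' : 'mcp';
}

console.log(selectExecutionMode({ id: 'lint' }, {}));
console.log(selectExecutionMode({ id: 'e2e', requires_cloud: true }, {}));
console.log(selectExecutionMode({ id: 'build' }, { execution: { parallel: true } }));
```

Keeping the decision in one pure function makes the default (`mcp`) explicit and easy to test before any builder or tester stage runs.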
- ## Learning Pipeline (local snapshots + auto validation)
- - Enabled per run with `LEARNING_PIPELINE_ENABLED=1`; after a successful stage, the CLI captures the working tree (tracked + untracked, git-ignored files excluded) into `.runs/<task-id>/cli/<run-id>/learning/<run-id>.tar.gz` and copies it to `.runs/learning-snapshots/<task-id>/<run-id>.tar.gz` by default (recorded as `learning.snapshot.storage_path`).
- - Manifests record the tag, commit SHA, tarball digest/path, queue payload path, and validation status (`validated`, `snapshot_failed`, `stalled_snapshot`, `needs_manual_scenario`) under `learning.*` so reviewers can audit outcomes without external storage.
- - Scenario synthesis replays the most recent successful command from the run (or prompt/diff fallback), writes `learning/scenario.json`, and automatically executes the commands; validation logs live at `learning/scenario-validation.log` and are stored in `learning.validation.log_path`.
- - Override snapshot storage with `LEARNING_SNAPSHOT_DIR=/custom/dir` when needed; the default lives under `.runs/learning-snapshots/` (or `$CODEX_ORCHESTRATOR_RUNS_DIR/learning-snapshots/` when configured).
-
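The snapshot storage rules in the removed bullets above (default directory, `CODEX_ORCHESTRATOR_RUNS_DIR`, and the `LEARNING_SNAPSHOT_DIR` override) amount to a simple precedence order. A minimal sketch follows, assuming a hypothetical `resolveSnapshotPath` helper. The README does not say how files are laid out inside an overridden directory, so the `<task-id>/<run-id>.tar.gz` layout under `LEARNING_SNAPSHOT_DIR` is an assumption.

```javascript
// Hypothetical resolver: the function name and return shape are assumptions.
function resolveSnapshotPath(env, taskId, runId) {
  // Runs directory defaults to `.runs` unless CODEX_ORCHESTRATOR_RUNS_DIR is set.
  const runsDir = env.CODEX_ORCHESTRATOR_RUNS_DIR || '.runs';
  // LEARNING_SNAPSHOT_DIR takes precedence over the derived default.
  const baseDir = env.LEARNING_SNAPSHOT_DIR || `${runsDir}/learning-snapshots`;
  // Assumption: an overridden directory keeps the <task-id>/<run-id>.tar.gz layout.
  return `${baseDir}/${taskId}/${runId}.tar.gz`;
}

console.log(resolveSnapshotPath({}, 'task-1', 'run-1'));
console.log(resolveSnapshotPath({ LEARNING_SNAPSHOT_DIR: '/custom/dir' }, 'task-1', 'run-1'));
```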
- ### How to run the learning pipeline locally
- - Seed a normal run and keep manifests grouped by task:
+ - Global install (recommended for CLI use):
  ```bash
- export MCP_RUNNER_TASK_ID=<task-id>
- LEARNING_PIPELINE_ENABLED=1 npx codex-orchestrator start diagnostics --format json
+ npm i -g @kbediako/codex-orchestrator
+ ```
+ - Or run via npx:
+ ```bash
+ npx codex-orchestrator --version
  ```
- - The learning section is written only when the run succeeds; rerun the command with `LEARNING_SNAPSHOT_DIR=<abs-path>` to redirect tarball copies.
 
- ## Repository Layout
- - `orchestrator/` – Core orchestration runtime (`TaskManager`, event bus, persistence, CLI, control-plane hooks, scheduler, privacy guard).
- - `packages/` – Shared libraries used by downstream projects (tool orchestrator, shared manifest schema, SDK shims, control-plane schema bundle).
- - `patterns/`, `eslint-plugin-patterns/` – Codemod + lint infrastructure invoked during builds.
- - `scripts/` – Operational helpers for repo contributors (e.g., `scripts/spec-guard.mjs`), not shipped in the npm package.
- - `tasks/`, `docs/`, `.agent/` – Project planning artifacts that must stay in sync (`[ ]` → `[x]` checklists pointing to manifest evidence).
- - `.runs/<task-id>/` – Per-task manifests, logs, metrics snapshots (`metrics.json`), and CLI run folders.
- - `out/<task-id>/` – Human-friendly summaries and (when enabled) cloud-sync audit logs.
+ Node.js >= 20 is required.
 
- ## CLI Quick Start
- 1. Install dependencies and build:
- ```bash
- npm install
- npm run build
- ```
- 2. Set the task context so artifacts land in the right folder:
+ ## Quick start
+
+ 1. Set a task id so artifacts land under `.runs/<task-id>/`:
  ```bash
  export MCP_RUNNER_TASK_ID=<task-id>
  ```
- 3. Launch diagnostics (defaults to the configured pipeline):
+ 2. Run a pipeline:
  ```bash
  npx codex-orchestrator start diagnostics --format json
  ```
- > Tip: keep `FEATURE_TFGRPO_GROUP`, `TFGRPO_GROUP_SIZE`, and related TF-GRPO env vars **unset** when running diagnostics. Many tests assume grouped execution is off, and the TF-GRPO guardrails require `groupSize >= 2` and `groupSize <= fanOutCapacity`. Use the `tfgrpo-learning` pipeline instead when you need grouped TF-GRPO runs.
- > HUD: add `--interactive` (or `--ui`) when stdout/stderr are TTY, TERM is not `dumb`, and CI is off to view the read-only Ink HUD. Non-interactive or JSON runs skip the HUD automatically.
- 4. Follow the run:
+ 3. Watch status:
  ```bash
  npx codex-orchestrator status --run <run-id> --watch --interval 10
  ```
- 5. Attach the CLI manifest path (`.runs/<task-id>/cli/<run-id>/manifest.json`) when you complete checklist items; the TaskManager summary lives at `.runs/<task-id>/<run-id>/manifest.json`, metrics aggregate in `.runs/<task-id>/metrics.json`, and summaries land in `out/<task-id>/state.json`.
-
- Use `npx codex-orchestrator resume --run <run-id>` to continue interrupted runs; the CLI verifies resume tokens, refreshes the plan, and updates the manifest safely before rerunning.
-
- ## Companion Package Commands
- - `codex-orchestrator mcp serve [--repo <path>] [--dry-run] [-- <extra args>]`: launch the MCP stdio server (delegates to `codex mcp-server`; stdout guard keeps protocol-only output, logs to stderr).
- - `codex-orchestrator init codex [--cwd <path>] [--force]`: copy starter templates into a repo (no overwrite unless `--force`).
- - `codex-orchestrator doctor [--format json]`: check optional tooling dependencies and print install commands.
- - `codex-orchestrator devtools setup [--yes]`: print DevTools MCP setup instructions (`--yes` applies `codex mcp add ...`).
- - `codex-orchestrator self-check --format json`: emit a safe JSON health payload for smoke tests.
- - `codex-orchestrator --version`: print the package version.
-
- ## Publishing (npm)
- - Pack audit: `npm run pack:audit` (validates the tarball file list; run `npm run clean:dist && npm run build` first if `dist/` contains non-runtime artifacts).
- - Pack smoke: `npm run pack:smoke` (installs the tarball in a temp dir and runs `--help`, `--version`, `self-check`; uses network).
- - Release tags: `vX.Y.Z` or `vX.Y.Z-alpha.N` must match `package.json` version.
- - Dist-tags: stable publishes to `latest`; alpha publishes to `alpha` and uses a GitHub prerelease.
- - Publishing auth: workflow exports `NODE_AUTH_TOKEN` from `secrets.NPM_TOKEN`; if set, publish without provenance; if not, the workflow relies on OIDC (`id-token: write`) and adds `--provenance`.
+ 4. Resume if needed:
+ ```bash
+ npx codex-orchestrator resume --run <run-id>
+ ```
 
- ## Parallel Runs (Meta-Orchestration)
- The orchestrator executes a single pipeline serially. “Parallelism” comes from running multiple orchestrator runs at the same time, ideally in separate git worktrees so builds/tests don’t contend for the same working tree outputs.
+ ## Delegation MCP server
 
- **Recommended pattern (one worktree per workstream)**
+ Run the delegation MCP server over stdio:
  ```bash
- git worktree add ../CO-stream-a HEAD
- git worktree add ../CO-stream-b HEAD
-
- # terminal A
- cd ../CO-stream-a
- export MCP_RUNNER_TASK_ID=<task-id>-a
- npx codex-orchestrator start diagnostics --format json
-
- # terminal B
- cd ../CO-stream-b
- export MCP_RUNNER_TASK_ID=<task-id>-b
- npx codex-orchestrator start diagnostics --format json
+ codex-orchestrator delegate-server --repo /path/to/repo --mode question_only
  ```
 
- Notes:
- - Use `--task <id>` instead of exporting `MCP_RUNNER_TASK_ID` when scripting runs.
- - Use `--parent-run <run-id>` to group related runs in manifests (optional).
- - If worktrees aren’t possible, isolate artifacts with `CODEX_ORCHESTRATOR_RUNS_DIR` and `CODEX_ORCHESTRATOR_OUT_DIR`. Use `CODEX_ORCHESTRATOR_ROOT` to point the CLI at a repo root when invoking from outside the repo (optional; defaults to the current working directory). Avoid concurrent builds/tests in the same checkout.
- - For a deeper runbook, see `.agent/SOPs/meta-orchestration.md`.
-
- ### Codex CLI prompts
- - Note: prompt installers and guardrail scripts live under `scripts/` and are repo-only (not included in the npm package).
- - The custom prompts live outside the repo at `~/.codex/prompts/diagnostics.md` and `~/.codex/prompts/review-handoff.md`. Recreate those files on every fresh machine so `/prompts:diagnostics` and `/prompts:review-handoff` are available in the Codex CLI palette.
- - These prompts are consumed by the Codex CLI UI only; the orchestrator does not read them. Keep updates synced across machines during onboarding.
- - To install or refresh the prompts (repo-only), run `scripts/setup-codex-prompts.sh` (use `--force` to overwrite existing files).
- - `/prompts:diagnostics` takes `TASK=<task-id> MANIFEST=<path> [NOTES=<free text>]`, exports `MCP_RUNNER_TASK_ID=$TASK`, runs `npx codex-orchestrator start diagnostics --format json`, tails `.runs/$TASK/cli/<run-id>/manifest.json` (or `npx codex-orchestrator status --watch`), and records evidence to `/tasks`, `docs/TASKS.md`, `.agent/task/...`, `.runs/$TASK/metrics.json`, and `out/$TASK/state.json` using `$MANIFEST`.
- - `/prompts:review-handoff` takes `TASK=<task-id> MANIFEST=<path> NOTES=<goal + summary + risks + optional questions>`, re-exports `MCP_RUNNER_TASK_ID`, and (repo-only) runs `node scripts/delegation-guard.mjs`, `node scripts/spec-guard.mjs --dry-run`, `npm run lint`, `npm run test`, optional `npm run eval:test`, plus `npm run review` (wraps `codex review` against the current diff and includes the latest run manifest path as evidence). It also reminds you to log approvals in `$MANIFEST` and mirror the evidence to the same docs/metrics/state targets.
- - In CI / `--no-interactive` pipelines (or when stdin is not a TTY), `npm run review` prints the review handoff prompt (including evidence paths) and exits successfully instead of invoking `codex review`. Set `FORCE_CODEX_REVIEW=1` to run `codex review` in those environments.
- - Always trigger diagnostics and review workflows through these prompts whenever you run the orchestrator so contributors consistently execute the required command sequences and capture auditable manifests.
-
- ### Identifier Guardrails
- - `MCP_RUNNER_TASK_ID` is no longer coerced or lowercased silently. The CLI calls the shared `sanitizeTaskId` helper and fails fast when the value contains control characters, traversal attempts, or Windows-reserved characters (`<`, `>`, `:`, `"`, `/`, `\`, `|`, `?`, `*`). Set the correct task ID in your environment *before* invoking the CLI.
- - Run IDs used for manifest or artifact storage must come from the CLI (or pass the shared `sanitizeRunId` helper). Strings with colons, control characters, or `../` are rejected to ensure every run directory lives under `.runs/<task-id>/cli/<run-id>` (and legacy `mcp` mirrors) without risking traversal.
-
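The identifier guardrails removed above describe a fail-fast validator rather than a silent normalizer. The following is a minimal sketch of that idea, not the package's `sanitizeTaskId` implementation: `checkTaskId`, its error messages, and the exact traversal rule are assumptions. The rejected character set is the one listed in the README.

```javascript
// Hypothetical validator: name, messages, and traversal rule are assumptions.
const WINDOWS_RESERVED = /[<>:"/\\|?*]/;
const CONTROL_CHARS = /[\u0000-\u001f\u007f]/;

function checkTaskId(value) {
  if (typeof value !== 'string' || value.length === 0) {
    throw new Error('Task ID must be a non-empty string');
  }
  if (CONTROL_CHARS.test(value)) {
    throw new Error('Task ID contains control characters');
  }
  // Path separators are part of the reserved set, so `../x` is rejected here.
  if (WINDOWS_RESERVED.test(value)) {
    throw new Error('Task ID contains reserved characters');
  }
  if (value === '.' || value === '..') {
    throw new Error('Task ID must not be a traversal segment');
  }
  // No coercion or lowercasing: the value is returned unchanged.
  return value;
}

console.log(checkTaskId('0101-Docs-Refresh'));
```

Throwing instead of rewriting the value means a typo in `MCP_RUNNER_TASK_ID` surfaces immediately rather than producing a run directory under an unexpected name.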
- ### Delegation Guardrails
- - `delegate.question.poll` clamps `wait_ms` to `MAX_QUESTION_POLL_WAIT_MS` (10s); each poll timeout is bounded by the remaining `wait_ms`.
- - Confirm-to-act fallback only triggers on confirmation-specific errors (`error.code`), not generic tool failures.
- - Tool profile entries used for MCP overrides are sanitized; only alphanumeric + `_`/`-` names are allowed (rejects `;`, `/`, `\n`, `=` and similar).
-
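Two of the delegation guardrails above are easy to express in code: clamping `wait_ms` and restricting tool profile names. The sketch below assumes hypothetical helper names (`clampWaitMs`, `nextPollTimeout`, `isAllowedToolProfile`); only the 10-second ceiling, the "remaining `wait_ms`" bound, and the allowed character set are taken from the README.

```javascript
// Value from the README: wait_ms is clamped to 10 seconds.
const MAX_QUESTION_POLL_WAIT_MS = 10_000;

// Hypothetical helper: clamp a requested wait to [0, MAX_QUESTION_POLL_WAIT_MS].
function clampWaitMs(requested) {
  if (!Number.isFinite(requested) || requested <= 0) return 0;
  return Math.min(requested, MAX_QUESTION_POLL_WAIT_MS);
}

// Hypothetical helper: a single poll never waits past the overall deadline.
function nextPollTimeout(perPollMs, deadlineMs, nowMs) {
  const remaining = Math.max(0, deadlineMs - nowMs);
  return Math.min(perPollMs, remaining);
}

// Only alphanumeric characters plus `_` and `-` are accepted.
const PROFILE_NAME = /^[A-Za-z0-9_-]+$/;
function isAllowedToolProfile(name) {
  return typeof name === 'string' && PROFILE_NAME.test(name);
}

console.log(clampWaitMs(60_000), nextPollTimeout(2_000, 10_000, 9_500), isAllowedToolProfile('dev-tools'));
```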
- ## Pipelines & Execution Plans
- - Default pipelines live in `codex.orchestrator.json` (repository-specific) and `orchestrator/src/cli/pipelines/` (built-in defaults). Each stage is either a command (shell execution) or a nested pipeline.
- - The `CommandPlanner` inspects the selected pipeline and target stage; you can pass `--target <stage-id>` (alias: `--target-stage`) or set `CODEX_ORCHESTRATOR_TARGET_STAGE` to focus on a specific step (e.g., rerun tests only).
- - Stage execution records stdout/stderr logs, exit codes, optional summaries, and failure data directly into the manifest (`commands[]` array).
- - Guardrails (repo-only): before review, run `node scripts/delegation-guard.mjs` and `node scripts/spec-guard.mjs --dry-run` to ensure delegation and spec freshness; the orchestrator tracks guardrail outcomes in the manifest (`guardrail_status`).
-
- ## Approval & Sandbox Model
- - Approval policies (`never`, `on-request`, `auto`, or custom strings) flow through `packages/orchestrator`. Tool invocations can require approval before sandbox elevation, and all prompts/decisions are persisted.
- - Sandbox retries (for transient `mcp` or cloud failures) use exponential backoff with configurable classifiers, ensuring tools get multiple attempts without masking hard failures.
-
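The sandbox retry bullet above combines two ideas: exponential backoff and a classifier that decides which failures are worth retrying. A minimal sketch of that pattern follows. Every name and default here (`backoffDelayMs`, `runWithRetries`, the `transient` error flag, 250 ms base delay, three attempts) is an assumption for illustration; the README does not document the toolkit's actual API or timings.

```javascript
// Hypothetical retry helpers: names, defaults, and the error shape are assumed.
function backoffDelayMs(attempt, baseDelayMs = 250, maxDelayMs = 8000) {
  // attempt is 1-based: 250, 500, 1000, ... capped at maxDelayMs.
  return Math.min(maxDelayMs, baseDelayMs * 2 ** (attempt - 1));
}

// Example classifier: only errors explicitly marked transient are retried.
const isTransient = (error) => error?.transient === true;

async function runWithRetries(task, options = {}) {
  const { maxAttempts = 3, classify = isTransient } = options;
  const sleep = options.sleep ?? ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 1; ; attempt += 1) {
    try {
      return await task(attempt);
    } catch (error) {
      // Hard failures and exhausted budgets surface immediately.
      if (!classify(error) || attempt >= maxAttempts) throw error;
      await sleep(backoffDelayMs(attempt));
    }
  }
}

console.log([1, 2, 3, 4].map((attempt) => backoffDelayMs(attempt)));
```

Passing the classifier in as an option is what keeps hard failures from being masked: anything the classifier does not recognize is rethrown on the first attempt.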
- ## Control Plane, Scheduler, and Cloud Sync
- - `control-plane/` builds versioned requests (`buildRunRequestV2`) and validates manifests against remote expectations. Drift reports are appended to run summaries so reviewers see deviations.
- - `scheduler/` resolves assignments, serializes plan data, and embeds scheduler state in manifests, making it easy to coordinate multi-stage work across agents.
- - `sync/` contains the cloud upload client + worker, but is not wired into the default CLI yet. Configure credentials through the credential broker (`orchestrator/src/credentials/`) and wire `createCloudSyncWorker` to an `EventBus` if you need uploads.
-
- ## Persistence & Observability
- - `TaskStateStore` writes per-task snapshots with bounded lock retries; failures degrade gracefully while still writing the main manifest.
- - `RunManifestWriter` generates the canonical manifest JSON for each run (mirrored under `.runs/`), while metrics appenders and summary writers keep `out/` up to date.
- - Heartbeat files and timestamps guard against stalled runs. `orchestrator/src/cli/metrics/metricsRecorder.ts` aggregates command durations, exit codes, and guardrail stats for later review.
- - Optional caps: `CODEX_ORCHESTRATOR_EXEC_EVENT_MAX_CHUNKS` limits captured exec chunk events per command (defaults to 500; set 0 for no cap), `CODEX_ORCHESTRATOR_TELEMETRY_MAX_EVENTS` caps in-memory telemetry events queued before flush (defaults to 1000; set 0 for no cap), and `CODEX_METRICS_PRIVACY_EVENTS_MAX` limits privacy decision events stored in `metrics.json` (-1 = no cap; `privacy_event_count` still reflects total).
-
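The optional caps listed above use two different "no cap" sentinels (`0` for the exec and telemetry caps, `-1` for the privacy cap), which is easy to get wrong when reading them from the environment. The sketch below shows one way to normalize them; `readCap` is a hypothetical helper, and the fallback used for `CODEX_METRICS_PRIVACY_EVENTS_MAX` is a placeholder because the README does not state its default.

```javascript
// Hypothetical parser for the cap variables; returns Infinity for "no cap".
function readCap(env, name, { fallback, unlimited }) {
  const raw = env[name];
  if (raw === undefined || raw.trim() === '') return fallback;
  const value = Number(raw);
  // Non-integer input falls back to the default instead of failing.
  if (!Number.isInteger(value)) return fallback;
  if (value === unlimited) return Infinity;
  return value < 0 ? fallback : value;
}

const env = {
  CODEX_ORCHESTRATOR_EXEC_EVENT_MAX_CHUNKS: '0',
  CODEX_METRICS_PRIVACY_EVENTS_MAX: '-1',
};

// Defaults of 500 and 1000 come from the README; 0 means "no cap" for both.
console.log(readCap(env, 'CODEX_ORCHESTRATOR_EXEC_EVENT_MAX_CHUNKS', { fallback: 500, unlimited: 0 }));
console.log(readCap(env, 'CODEX_ORCHESTRATOR_TELEMETRY_MAX_EVENTS', { fallback: 1000, unlimited: 0 }));
// The README gives -1 as "no cap" but no default; Infinity is a placeholder here.
console.log(readCap(env, 'CODEX_METRICS_PRIVACY_EVENTS_MAX', { fallback: Infinity, unlimited: -1 }));
```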
- ## Customizing for New Projects
- - Duplicate the templates under `/tasks`, `docs/`, and `.agent/` for your task ID and keep checklist status mirrored (`[ ]` → `[x]`) with links to the manifest that proves each outcome.
- - Update `docs/PRD.md`, `docs/TECH_SPEC.md`, and `docs/ACTION_PLAN.md` with project details and evidence paths (`.runs/<task-id>/...`).
- - Refresh `.agent/` SOPs with task-specific guardrails, escalation contacts, and artifact locations.
- - Remove placeholder references in manifests/docs before merging so downstream teams see only live project data.
-
- ## Development Workflow
- Note: the commands below assume a source checkout; `scripts/` helpers are not included in the npm package.
- | Command | Purpose |
- | --- | --- |
- | `npm run build` | Compiles TypeScript to `dist/` (required for packaging and running the CLI from `dist/`). |
- | `npm run lint` | Lints orchestrator, adapters, shared packages. Auto-runs `node scripts/build-patterns-if-needed.mjs` so codemods compile when missing/outdated. |
- | `npm run test` | Vitest suite covering orchestration core, CLI services, and patterns. |
- | `npm run eval:test` | Optional evaluation harness (enable when `evaluation/fixtures/**` is populated). |
- | `npm run docs:check` | Deterministically validates scripts/pipelines/paths referenced in agent-facing docs. |
- | `npm run docs:freshness` | Validates docs registry coverage + review recency; writes `out/<task-id>/docs-freshness.json`. |
- | `node scripts/delegation-guard.mjs` | Enforces subagent delegation evidence before review (repo-only). |
- | `node scripts/spec-guard.mjs --dry-run` | Validates spec freshness; required before review (repo-only). |
- | `node scripts/diff-budget.mjs` | Guards against oversized diffs before review (repo-only; defaults: 25 files / 800 lines; supports explicit overrides). |
- | `npm run review` | Runs `codex review` with the latest run manifest path as evidence (repo-only; CI disables stdin; set `CODEX_REVIEW_NON_INTERACTIVE=1` to enforce locally). |
-
- Run `npm run build` to compile TypeScript before packaging or invoking the CLI directly from `dist/`.
-
- ## Diff Budget
-
- This repo enforces a small “diff budget” via `node scripts/diff-budget.mjs` to keep PRs reviewable and avoid accidental scope creep (repo-only).
-
- - Defaults: 25 changed files / 800 total lines changed (additions + deletions), excluding ignored paths.
- - CI: `.github/workflows/core-lane.yml` runs the diff budget on pull requests and sets `BASE_SHA` to the PR base commit.
- - Local: run `node scripts/diff-budget.mjs` before `npm run review` (the review wrapper runs it automatically).
-
- ### Local usage
- - Working tree diff against the default base (uses `BASE_SHA`/`origin/main`/initial commit as available): `node scripts/diff-budget.mjs`
- - Explicit base: `node scripts/diff-budget.mjs --base <ref>`
- - Commit-scoped mode (ignores working tree state): `node scripts/diff-budget.mjs --commit <sha>`
-
- ### Overrides (exceptional)
- - Local: `DIFF_BUDGET_OVERRIDE_REASON="..." node scripts/diff-budget.mjs`
- - CI: apply label `diff-budget-override` and add a PR body line `Diff budget override: <reason>` (label without a non-empty reason fails CI).
-
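The diff budget described above boils down to counting changed files and changed lines and comparing them with the two defaults (25 files, 800 lines), with a non-empty reason acting as an override. The sketch below illustrates that check against `git diff --numstat` output. It is not the contents of `scripts/diff-budget.mjs`: `parseNumstat` and `checkDiffBudget` are hypothetical names, and ignored-path handling is left out.

```javascript
// Parse `git diff --numstat` output: "<added>\t<deleted>\t<path>" per line.
function parseNumstat(text) {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => {
      const [added, deleted, ...rest] = line.split('\t');
      // Binary files report "-" for both counts; treat them as zero lines.
      return { path: rest.join('\t'), lines: (Number(added) || 0) + (Number(deleted) || 0) };
    });
}

// Hypothetical check using the README defaults: 25 files / 800 changed lines.
function checkDiffBudget(files, options = {}) {
  const { maxFiles = 25, maxLines = 800, overrideReason = '' } = options;
  const totalLines = files.reduce((sum, file) => sum + file.lines, 0);
  const exceeded = files.length > maxFiles || totalLines > maxLines;
  // An override only counts when a non-empty reason is supplied.
  const overridden = exceeded && overrideReason.trim() !== '';
  return { files: files.length, totalLines, ok: !exceeded || overridden, overridden };
}

const sample = '10\t5\tsrc/a.js\n-\t-\tassets/logo.png\n';
console.log(checkDiffBudget(parseNumstat(sample)));
```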
- ## Review Handoff
-
- Use an explicit handoff note for reviewers. `NOTES` is required for review runs; questions are optional:
-
- `NOTES="<goal + summary + risks + optional questions>" npm run review` (repo-only; CI disables stdin; set `CODEX_REVIEW_NON_INTERACTIVE=1` to enforce locally).
-
- Template: `Goal: ... | Summary: ... | Risks: ... | Questions (optional): ...`
+ Register it with Codex once, then enable per run:
+ ```bash
+ codex mcp add delegation -- codex-orchestrator delegate-server --repo /path/to/repo
+ codex -c 'mcp_servers.delegation.enabled=true' ...
+ ```
 
- To enable Chrome DevTools for review runs, set `CODEX_REVIEW_DEVTOOLS=1` (uses a codex config override; no repo scripts required).
- Default to the standard `implementation-gate` for general reviews; enable DevTools only when the review needs Chrome DevTools capabilities (visual/layout checks, network/perf diagnostics). After fixing review feedback, rerun the same gate and include any follow-up questions in `NOTES`.
- To run the full implementation gate with DevTools-enabled review, use `CODEX_REVIEW_DEVTOOLS=1 npx codex-orchestrator start implementation-gate --format json --no-interactive --task <task-id>`.
+ ## Skills (bundled)
 
- ## Frontend Testing
- Frontend testing is a first-class pipeline with DevTools off by default. The shipped pipelines already set `CODEX_NON_INTERACTIVE=1`; add it explicitly for custom automation or when you want the `frontend-test` shortcut to suppress Codex prompts:
- - `CODEX_NON_INTERACTIVE=1 npx codex-orchestrator start frontend-testing --format json --no-interactive --task <task-id>`
- - `CODEX_NON_INTERACTIVE=1 CODEX_REVIEW_DEVTOOLS=1 npx codex-orchestrator start frontend-testing --format json --no-interactive --task <task-id>` (DevTools enabled)
- - `CODEX_NON_INTERACTIVE=1 codex-orchestrator frontend-test` (shortcut; add `--devtools` to enable DevTools)
+ The release ships skills under `skills/`. Install them into `$CODEX_HOME/skills`:
+ ```bash
+ codex-orchestrator skills install
+ ```
 
- If you run the pipelines from this repo, run `npm run build` first so `dist/` stays current (the pipeline executes the compiled runner).
+ Options:
+ - `--force` overwrites existing files.
+ - `--codex-home <path>` targets a different Codex home directory.
 
- Note: the frontend-testing pipeline reads the shared `CODEX_REVIEW_DEVTOOLS` flag; prefer `--devtools` or `CODEX_REVIEW_DEVTOOLS=1` for explicit enablement.
+ Current bundled skills:
+ - `delegation-usage`
 
- Optional prompt overrides:
- - `CODEX_FRONTEND_TEST_PROMPT` (inline prompt)
- - `CODEX_FRONTEND_TEST_PROMPT_PATH` (path to a prompt file)
+ ## DevTools readiness
 
- `--no-interactive` disables the HUD only; set `CODEX_NON_INTERACTIVE=1` when you need to suppress Codex prompts (e.g., shortcut runs or custom automation).
+ Check DevTools readiness (skill + MCP config):
+ ```bash
+ codex-orchestrator doctor --format json
+ ```
 
- Check readiness with `codex-orchestrator doctor --format json` (reports DevTools skill + MCP config availability). Use `codex-orchestrator devtools setup` to print setup steps.
+ Print DevTools MCP setup guidance:
+ ```bash
+ codex-orchestrator devtools setup
+ ```
 
- ## Mirror Workflows
- - `npm run mirror:fetch -- --project <name> [--dry-run] [--force]`: reads `packages/<project>/mirror.config.json` (origin, routes, asset roots, rewrite/block/allow lists), caches downloads **per project** under `.runs/<task>/mirror/<project>/cache`, strips tracker patterns, rewrites externals to `/external/<host>/...`, localizes OG/twitter preview images, rewrites share links off tracker-heavy hosts, and stages into `.runs/<task>/mirror/<project>/<timestamp>/staging/public` before promoting to `packages/<project>/public`. Non-origin assets fall back to Web Archive when the primary host is down; promotion is skipped if errors are detected unless `--force` is set. Manifests live at `.runs/<task>/mirror/<project>/<timestamp>/manifest.json` (warns when `MCP_RUNNER_TASK_ID` is unset; honors `compliance/permit.json` when present).
- - `npm run mirror:serve -- --project <name> [--port <port>] [--csp <self|strict|off>] [--no-range]`: shared local-mirror server with traversal guard, HTML no-cache/asset immutability, optional CSP, optional Range support, and directory-listing blocks.
- - `npm run mirror:check -- --project <name> [--port <port>]`: boots a temporary mirror server when needed and verifies all configured routes with Playwright, failing on outbound hosts outside the allowlist, tracker strings (gtag/gtm/analytics/hotjar/facebook/clarity/etc.), unresolved assets, absolute https:// references, or non-200 responses. Keep this opt-in and trigger it when `packages/<project>/public` changes.
+ ## Common commands
 
- ## Hi-Fi Design Toolkit Captures
- Use the hi-fi pipeline to snapshot complex marketing sites (motion, interactions, tokens) while keeping the repo cloneable:
+ - `codex-orchestrator start <pipeline>` — run a pipeline.
+ - `codex-orchestrator plan <pipeline>` — preview pipeline stages.
+ - `codex-orchestrator exec <cmd>` — run a one-off command with the exec runtime.
+ - `codex-orchestrator self-check --format json` — JSON health payload.
+ - `codex-orchestrator mcp serve` — Codex MCP stdio server.
 
- 1. **Configure the source:** Update `design.config.yaml` → `pipelines.hi_fi_design_toolkit.sources` with the target URL, slug, title, and breakpoints (the repo defaults to an empty `sources` list until you add one).
- 2. **Permit the domain:** Add (or update) the matching record in `compliance/permit.json` so Playwright, video capture, and live assets are explicitly approved for that origin.
- 3. **Prep tooling:**
- - `npm install && npm run build`
- - `npm run setup:design-tools` (installs design-system deps) and ensure FFmpeg is available (`brew install ffmpeg` on macOS).
- 4. **Run the pipeline:**
- ```bash
- export MCP_RUNNER_TASK_ID=<task-id>
- npx codex-orchestrator start hi-fi-design-toolkit --format json --task <task-id>
- ```
- Manifests/logs/state land under `.runs/<task-id>/cli/<run-id>/`, while staged artifacts land under `.runs/<task-id>/<run-id>/artifacts/design-toolkit/` with human summaries mirrored to `out/<task-id>/`.
- 5. **Validate the clone:** serve the staged reference directory, e.g.
- ```bash
- cd .runs/<task-id>/<run-id>/artifacts/design-toolkit/reference/<slug>
- python3 -m http.server 4173
- ```
- The build now mirrors all `/assets/...` content and adds root shortcuts (`wp-content`, `wp-includes`, etc.) so even absolute WordPress paths work offline. A lightweight `codex-scroll-fallback` script only unlocks scrolling if the captured page never enables it.
- 6. **Document learnings:** Drop run evidence into `docs/findings/<slug>.md` (see `docs/findings/ethical-life.md` for the latest example) so reviewers know which manifest, artifacts, and diffs back each finding.
+ ## What ships in the npm release
 
- ## Extending the Orchestrator
- - Add new agent strategies by implementing the planner/builder/tester/reviewer interfaces and wiring them into `TaskManager`.
- - Register additional pipelines or override defaults through `codex.orchestrator.json`. Nested pipelines let you compose reusable command groups.
- - Hook external systems by subscribing to `EventBus` events (plan/build/test/review/run) or by wiring optional integrations like `CloudSyncWorker`.
- - Leverage the shared TypeScript definitions in `packages/shared` to keep manifest, metrics, and telemetry consumers aligned.
+ - CLI + built-in pipelines
+ - Delegation MCP server (`delegate-server`)
+ - Bundled skills under `skills/`
+ - Schemas and templates needed by the CLI
 
- ---
+ ## Repository + contributor guide
 
- When preparing a review (repo-only), always capture the latest manifest path, run `node scripts/delegation-guard.mjs` and `node scripts/spec-guard.mjs --dry-run`, and ensure checklist mirrors (`/tasks`, `docs/`, `.agent/`) point at the evidence generated by Codex Orchestrator. That keeps the automation trustworthy and auditable across projects.
+ Repo internals, development workflows, and deeper architecture notes live here:
+ - `docs/README.md`
@@ -13,6 +13,7 @@ import { buildSelfCheckResult } from '../orchestrator/src/cli/selfCheck.js';
  import { initCodexTemplates, formatInitSummary } from '../orchestrator/src/cli/init.js';
  import { runDoctor, formatDoctorSummary } from '../orchestrator/src/cli/doctor.js';
  import { formatDevtoolsSetupSummary, runDevtoolsSetup } from '../orchestrator/src/cli/devtoolsSetup.js';
+ import { formatSkillsInstallSummary, installSkills } from '../orchestrator/src/cli/skills.js';
  import { loadPackageInfo } from '../orchestrator/src/cli/utils/packageInfo.js';
  import { slugify } from '../orchestrator/src/cli/utils/strings.js';
  import { serveMcp } from '../orchestrator/src/cli/mcp.js';
@@ -65,6 +66,9 @@ async function main() {
          case 'devtools':
              await handleDevtools(args);
              break;
+         case 'skills':
+             await handleSkills(args);
+             break;
          case 'mcp':
              await handleMcp(args);
              break;
@@ -499,6 +503,32 @@ async function handleDevtools(rawArgs) {
          console.log(line);
      }
  }
+ async function handleSkills(rawArgs) {
+     const { positionals, flags } = parseArgs(rawArgs);
+     const subcommand = positionals[0];
+     const wantsHelp = flags['help'] === true || subcommand === 'help' || subcommand === '--help';
+     if (!subcommand || wantsHelp) {
+         printSkillsHelp();
+         return;
+     }
+     switch (subcommand) {
+         case 'install': {
+             const format = flags['format'] === 'json' ? 'json' : 'text';
+             const force = flags['force'] === true;
+             const codexHome = readStringFlag(flags, 'codex-home');
+             const result = await installSkills({ force, codexHome });
+             if (format === 'json') {
+                 console.log(JSON.stringify(result, null, 2));
+             }
+             else {
+                 console.log(formatSkillsInstallSummary(result).join('\n'));
+             }
+             return;
+         }
+         default:
+             throw new Error(`Unknown skills command: ${subcommand}`);
+     }
+ }
  async function handleMcp(rawArgs) {
      const { positionals, flags } = parseArgs(rawArgs);
      const subcommand = positionals.shift();
@@ -720,6 +750,10 @@ Commands:
    devtools setup            Print DevTools MCP setup instructions.
      --yes                   Apply setup by running "codex mcp add ...".
      --format json           Emit machine-readable output (dry-run only).
+   skills install            Install bundled skills into $CODEX_HOME/skills.
+     --force                 Overwrite existing skill files.
+     --codex-home <path>     Override the target Codex home directory.
+     --format json           Emit machine-readable output.
    mcp serve [--repo <path>] [--dry-run] [-- <extra args>]
    delegate-server           Run the delegation MCP server (stdio).
      --repo <path>           Repo root for config + manifests (default cwd).
@@ -735,3 +769,13 @@ function printVersion() {
      const pkg = loadPackageInfo();
      console.log(pkg.version ?? 'unknown');
  }
+ function printSkillsHelp() {
+     console.log(`Usage: codex-orchestrator skills <command> [options]
+
+ Commands:
+   install                   Install bundled skills into $CODEX_HOME/skills.
+     --force                 Overwrite existing skill files.
+     --codex-home <path>     Override the target Codex home directory.
+     --format json           Emit machine-readable output.
+ `);
+ }
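The flag handling in `handleSkills` above can be read as a small normalization step. The sketch below is illustrative only: `normalizeSkillsFlags` is a hypothetical helper that does not exist in the package, and the shape of the `flags` object (booleans for bare switches, strings for valued flags) is an assumption about what `parseArgs` returns.

```javascript
// Hypothetical helper (not in the package): mirrors the two comparisons
// that handleSkills applies to the parsed flags.
function normalizeSkillsFlags(flags) {
    return {
        // Any value other than the exact string 'json' falls back to 'text'.
        format: flags['format'] === 'json' ? 'json' : 'text',
        // Only a strict boolean true enables overwriting.
        force: flags['force'] === true
    };
}

console.log(normalizeSkillsFlags({ format: 'json', force: true }));
console.log(normalizeSkillsFlags({ format: 'yaml' }));
```

Under these assumptions, an unrecognized `--format` value is not rejected; the command simply prints the text summary.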
@@ -0,0 +1,91 @@
+ import { existsSync } from 'node:fs';
+ import { copyFile, mkdir, readdir, stat } from 'node:fs/promises';
+ import { dirname, join, relative, resolve } from 'node:path';
+ import process from 'node:process';
+ import { resolveCodexHome } from './utils/devtools.js';
+ import { findPackageRoot } from './utils/packageInfo.js';
+ export async function installSkills(options = {}) {
+     const pkgRoot = findPackageRoot();
+     const sourceRoot = join(pkgRoot, 'skills');
+     await assertDirectory(sourceRoot);
+     const codexHome = resolveCodexHomePath(options.codexHome);
+     const targetRoot = join(codexHome, 'skills');
+     const written = [];
+     const skipped = [];
+     const skillNames = await listSkillNames(sourceRoot);
+     await copyDir(sourceRoot, targetRoot, {
+         force: options.force ?? false,
+         written,
+         skipped
+     });
+     return {
+         written,
+         skipped,
+         sourceRoot,
+         targetRoot,
+         skills: skillNames
+     };
+ }
+ export function formatSkillsInstallSummary(result, cwd = process.cwd()) {
+     const lines = [];
+     lines.push(`Skills source: ${result.sourceRoot}`);
+     lines.push(`Skills target: ${result.targetRoot}`);
+     if (result.skills.length > 0) {
+         lines.push(`Skills: ${result.skills.join(', ')}`);
+     }
+     if (result.written.length > 0) {
+         lines.push('Written:');
+         for (const filePath of result.written) {
+             lines.push(` - ${relative(cwd, filePath)}`);
+         }
+     }
+     if (result.skipped.length > 0) {
+         lines.push('Skipped (already exists):');
+         for (const filePath of result.skipped) {
+             lines.push(` - ${relative(cwd, filePath)}`);
+         }
+     }
+     if (result.written.length === 0 && result.skipped.length === 0) {
+         lines.push('No files written.');
+     }
+     return lines;
+ }
+ function resolveCodexHomePath(override) {
+     if (override && override.trim().length > 0) {
+         const trimmed = override.trim();
+         return resolve(process.cwd(), trimmed);
+     }
+     return resolveCodexHome(process.env);
+ }
+ async function listSkillNames(sourceRoot) {
+     const entries = await readdir(sourceRoot, { withFileTypes: true });
+     return entries.filter((entry) => entry.isDirectory()).map((entry) => entry.name);
+ }
+ async function assertDirectory(path) {
+     const info = await stat(path).catch(() => null);
+     if (!info || !info.isDirectory()) {
+         throw new Error(`Skills directory not found: ${path}`);
+     }
+ }
+ async function copyDir(sourceDir, targetDir, options) {
+     await mkdir(targetDir, { recursive: true });
+     const entries = await readdir(sourceDir, { withFileTypes: true });
+     for (const entry of entries) {
+         const sourcePath = join(sourceDir, entry.name);
+         const targetPath = join(targetDir, entry.name);
+         if (entry.isDirectory()) {
+             await copyDir(sourcePath, targetPath, options);
+             continue;
+         }
+         if (!entry.isFile()) {
+             continue;
+         }
+         if (existsSync(targetPath) && !options.force) {
+             options.skipped.push(targetPath);
+             continue;
+         }
+         await mkdir(dirname(targetPath), { recursive: true });
+         await copyFile(sourcePath, targetPath);
+         options.written.push(targetPath);
+     }
+ }
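The overwrite policy in `copyDir` above (skip a file that already exists at the target unless `force` is set) can be sketched without touching the filesystem. This is a minimal illustration, not code from the package: `planSkillCopy` is a hypothetical helper, the file names are made up, and a plain list of existing paths stands in for the `existsSync` check.

```javascript
// Hypothetical helper (not in the package): decides which files a skills
// install would write or skip, mirroring the existence check in copyDir.
function planSkillCopy(sourceFiles, existingTargets, force = false) {
    const existing = new Set(existingTargets);
    const written = [];
    const skipped = [];
    for (const file of sourceFiles) {
        // An existing target is preserved unless force is set.
        if (existing.has(file) && !force) {
            skipped.push(file);
            continue;
        }
        written.push(file);
    }
    return { written, skipped };
}

// Illustrative file names only.
const files = ['example-skill/SKILL.md', 'example-skill/notes.md'];
const onDisk = ['example-skill/SKILL.md'];

console.log(planSkillCopy(files, onDisk));       // SKILL.md is skipped, notes.md is written
console.log(planSkillCopy(files, onDisk, true)); // both files are written, nothing is skipped
```

As in the diffed code, the decision is made per file, so a partially installed skill gets its missing files added while existing ones stay untouched.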
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@kbediako/codex-orchestrator",
-   "version": "0.1.8",
+   "version": "0.1.9",
    "license": "SEE LICENSE IN LICENSE",
    "type": "module",
    "bin": {