@kbediako/codex-orchestrator 0.1.37 → 0.1.38

package/README.md CHANGED
@@ -39,6 +39,16 @@ Node.js >= 20 is required.
  > Tip: if you prefer `npx`, replace `codex-orch` with `npx @kbediako/codex-orchestrator`.
  > Tip: for multiple commands, you can also `export MCP_RUNNER_TASK_ID=<task-id>` once.
 
+ ## Runtime + Execution Modes
+
+ - Mode semantics are orthogonal:
+   - `executionMode=mcp|cloud` controls where stages execute.
+   - `runtimeMode=cli|appserver` controls local runtime provider selection.
+ - Local default runtime is `appserver`; preserve `--runtime-mode cli` as break-glass.
+ - `--execution-mode cloud --runtime-mode appserver` is intentionally unsupported and fails fast with actionable errors.
+ - `js_repl` is enabled by default globally. For deterministic cloud contracts, run explicit feature lanes (`CODEX_CLOUD_ENABLE_FEATURES=js_repl` and separate `CODEX_CLOUD_DISABLE_FEATURES=js_repl` runs). Use `CODEX_CLOUD_DISABLE_FEATURES=js_repl` for task-scoped cloud break-glass; reserve `codex features disable js_repl` for global emergency toggles and re-enable with `codex features enable js_repl`.
+ - `memories` remains scoped to explicit eval lanes (legacy alias `memory_tool` is compatibility-only).
+
  ## Downstream init (recommended)
 
  Use this when you want Codex to drive work inside another repo with the CO defaults.
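
The mode rules in the `Runtime + Execution Modes` section added in this hunk can be read as a small validation step. The following is a minimal sketch, not the package's implementation: `resolveModes`, its option names, the `mcp` execution default, and the `cli` fallback for cloud runs with no explicit runtime are assumptions made for illustration. Only the mode values, the local `appserver` default, and the rejected `cloud` + `appserver` pairing come from the README text.

```javascript
// Illustrative sketch only: resolveModes and its defaults are hypothetical,
// not the actual codex-orchestrator implementation.
const EXECUTION_MODES = ['mcp', 'cloud'];
const RUNTIME_MODES = ['cli', 'appserver'];

function resolveModes({ executionMode = 'mcp', runtimeMode } = {}) {
  if (!EXECUTION_MODES.includes(executionMode)) {
    throw new Error(`Unknown executionMode "${executionMode}" (expected mcp|cloud)`);
  }
  // README: the local default runtime is appserver. Falling back to cli for
  // cloud runs without an explicit runtime is an assumption of this sketch.
  const resolvedRuntime = runtimeMode ?? (executionMode === 'cloud' ? 'cli' : 'appserver');
  if (!RUNTIME_MODES.includes(resolvedRuntime)) {
    throw new Error(`Unknown runtimeMode "${resolvedRuntime}" (expected cli|appserver)`);
  }
  // README: cloud + appserver is intentionally unsupported and fails fast.
  if (executionMode === 'cloud' && resolvedRuntime === 'appserver') {
    throw new Error(
      'executionMode=cloud with runtimeMode=appserver is unsupported; ' +
        'use --runtime-mode cli or --execution-mode mcp'
    );
  }
  return { executionMode, runtimeMode: resolvedRuntime };
}
```

Keeping the two axes as separate inputs and rejecting the one unsupported pairing in a single place mirrors the "orthogonal" wording: every other combination passes through unchanged.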
@@ -97,6 +107,7 @@ codex -c 'mcp_servers.delegation.enabled=true' ...
  Codex built-ins are `default`, `explorer`, `worker`, and `awaiter`. `researcher` is user-defined.
  - `spawn_agent` defaults to `default` when `agent_type` is omitted, so always set `agent_type` explicitly.
  - Multi-turn loops are supported (`spawn_agent` -> `send_input` -> `wait`/`resume_agent` -> `close_agent`), so subagents can iterate before parent synthesis.
+ - Keep `fork_context` off by default for bounded subagent streams; set `fork_context=true` only when the subagent must inherit prior thread history.
 
  In Codex CLI `0.105.0`, built-in `explorer` no longer pins an older model profile; it inherits top-level defaults unless you attach a role `config_file`.
  CO now ships this downstream starter config via `init codex` (source template: `templates/codex/.codex/config.toml`; installed as .codex/config.toml in target repos):
@@ -155,7 +166,7 @@ Delegation guard profile:
 
  RLM (Recursive Language Model) is the long-horizon loop used by the `rlm` pipeline (`codex-orchestrator rlm "<goal>"` or `codex-orchestrator start rlm --goal "<goal>"`). Delegated runs only enter RLM when the child is launched with the `rlm` pipeline (or the rlm runner directly). In auto mode it resolves to symbolic only when context is large (`RLM_SYMBOLIC_MIN_BYTES`) and an explicit context signal is present (`RLM_CONTEXT_PATH` or delegated run); otherwise it stays iterative. The runner writes state to `.runs/<task-id>/cli/<run-id>/rlm/state.json` and stops when the validator passes or budgets are exhausted.
  For symbolic mode, the Option 2 alignment checker is enabled by default (`RLM_ALIGNMENT_CHECKER=1`) and writes append-only alignment artifacts under `.runs/<task-id>/cli/<run-id>/rlm/alignment/` (ledger + projection). Rollback toggle: set `RLM_ALIGNMENT_CHECKER=0`. Enforcement is opt-in via `RLM_ALIGNMENT_CHECKER_ENFORCE=1`.
- Symbolic subcalls can optionally use collab tools. Fast path: `codex-orchestrator rlm --multi-agent auto "<goal>"` (legacy alias: `--collab auto`; sets `RLM_SYMBOLIC_MULTI_AGENT=1` plus legacy `RLM_SYMBOLIC_COLLAB=1` for compatibility, and implies symbolic mode). Collab requires `multi_agent=true` in `codex features list` (`collab` remains a legacy alias). Collab tool calls parsed from `codex exec --json --enable multi_agent` are stored in `manifest.collab_tool_calls` (bounded by `CODEX_ORCHESTRATOR_COLLAB_MAX_EVENTS`, set to `0` to disable). For auditable role routing, prefix spawned prompts with `[agent_type:<role>]` and set `spawn_agent.agent_type` when supported; lifecycle validation enforces prompt-role evidence and validates `agent_type` when present (`RLM_SYMBOLIC_MULTI_AGENT_ROLE_POLICY=warn|off`, legacy alias `RLM_COLLAB_ROLE_POLICY`; `RLM_SYMBOLIC_MULTI_AGENT_ALLOW_DEFAULT_ROLE=1`, legacy alias `RLM_COLLAB_ALLOW_DEFAULT_ROLE`). `codex-orchestrator codex setup` remains available when you want a managed/pinned CLI path (opt-in via `CODEX_CLI_USE_MANAGED=1`).
+ Symbolic subcalls can optionally use collab tools. Fast path: `codex-orchestrator rlm --multi-agent auto "<goal>"` (legacy alias: `--collab auto`; sets `RLM_SYMBOLIC_MULTI_AGENT=1` plus legacy `RLM_SYMBOLIC_COLLAB=1` for compatibility, and implies symbolic mode). Collab requires `multi_agent=true` in `codex features list` (`collab` remains a legacy alias). Collab tool calls parsed from `codex exec --json --enable multi_agent` are stored in `manifest.collab_tool_calls` (bounded by `CODEX_ORCHESTRATOR_COLLAB_MAX_EVENTS`, set to `0` to disable); when present in events, `spawn_agent.fork_context` is captured for observability and surfaced in `codex-orchestrator doctor --usage` fork-context counters. For auditable role routing, prefix spawned prompts with `[agent_type:<role>]` and set `spawn_agent.agent_type` when supported; lifecycle validation enforces prompt-role evidence and validates `agent_type` when present (`RLM_SYMBOLIC_MULTI_AGENT_ROLE_POLICY=warn|off`, legacy alias `RLM_COLLAB_ROLE_POLICY`; `RLM_SYMBOLIC_MULTI_AGENT_ALLOW_DEFAULT_ROLE=1`, legacy alias `RLM_COLLAB_ALLOW_DEFAULT_ROLE`). `codex-orchestrator codex setup` remains available when you want a managed/pinned CLI path (opt-in via `CODEX_CLI_USE_MANAGED=1`).
  For batch fan-out jobs, prefer native `spawn_agents_on_csv` before building custom orchestration wrappers.
 
  ### Delegation flow
@@ -215,10 +226,12 @@ Options:
  - `--codex-home <path>` targets a different Codex home directory.
 
  Bundled skills (may vary by release):
+ - `codex-orchestrator`
  - `collab-subagents-first`
  - `chrome-devtools`
  - `delegation-usage`
  - `standalone-review`
+ - `elegance-review`
  - `docs-first`
  - `collab-evals`
  - `collab-deliberation`
@@ -276,7 +289,7 @@ codex-orchestrator doctor --cloud-preflight
  - Active PR watch-resolve-merge loop: `codex-orchestrator pr resolve-merge --pr <number> --quiet-minutes <window>` (add `--auto-merge` when approved; exits early when author action is required).
  - Passive PR monitor loop: `codex-orchestrator pr watch-merge --pr <number> --quiet-minutes <window>` (monitor-only behavior; keeps waiting unless terminal/timeout).
  - Review checkpoints (npm-only safe): `NOTES="Goal: ... | Summary: ... | Risks: ..." codex-orchestrator review --task <task-id>` for manifest-backed standalone review wrapper behavior (auto-skips repo-only diff-budget script when unavailable in downstream installs); use `codex review "<focus>"` for quick prompt-only checks; use `codex-orchestrator start implementation-gate --task <task-id> --format json` when you want a full gate run.
- - Downstream simulation before shipping wrapper/skill changes: `npm run pack:smoke` (packaged CLI in temp mock repo; validates `review` artifacts and `long-poll-wait` install path).
+ - Downstream simulation before shipping wrapper/skill changes: `npm run pack:smoke` (packaged CLI in temp mock repo; validates `review` artifacts and `long-poll-wait` install path; spot-check gate). Use `npm run pack:audit` for full tarball inventory validation.
  - Delegation: `codex-orchestrator doctor --apply --yes`, then enable for a Codex run with: `codex -c 'mcp_servers.delegation.enabled=true' ...`
  - Collab (symbolic RLM subagents): `codex-orchestrator rlm --multi-agent auto "<goal>"` (legacy alias: `--collab auto`; requires Codex `features.multi_agent=true`)
  - Cloud: set `CODEX_CLOUD_ENV_ID` (and optional `CODEX_CLOUD_BRANCH`), then run: `codex-orchestrator start <pipeline> --cloud --target <stage-id>`
@@ -31,6 +31,9 @@ export async function runDoctorUsage(options = {}) {
  const collabByEventType = {};
  const collabTools = new Map();
  const collabCaptureDisabled = String(process.env.CODEX_ORCHESTRATOR_COLLAB_MAX_EVENTS ?? '').trim() === '0';
+ let collabSpawnForkContextTrue = 0;
+ let collabSpawnForkContextFalse = 0;
+ let collabSpawnForkContextUnknown = 0;
  let collabRunsWithUnclosedSpawnAgents = 0;
  let collabUnclosedSpawnAgents = 0;
  let collabRunsWithSpawnThreadLimitFailures = 0;
@@ -160,6 +163,15 @@ export async function runDoctorUsage(options = {}) {
    continue;
  }
  if (tool === 'spawn_agent') {
+   if (entry?.fork_context === true) {
+     collabSpawnForkContextTrue += 1;
+   }
+   else if (entry?.fork_context === false) {
+     collabSpawnForkContextFalse += 1;
+   }
+   else {
+     collabSpawnForkContextUnknown += 1;
+   }
    if (isFailed) {
      const rawFailedSpawnId = typeof entry?.item_id === 'string' ? entry.item_id.trim() : '';
      const failedSpawnId = rawFailedSpawnId.length > 0 && rawFailedSpawnId !== 'unknown'
@@ -290,6 +302,9 @@ export async function runDoctorUsage(options = {}) {
  by_event_type: collabByEventType,
  top_tools: collabTopTools,
  capture_disabled: collabCaptureDisabled,
+ spawn_agent_fork_context_true: collabSpawnForkContextTrue,
+ spawn_agent_fork_context_false: collabSpawnForkContextFalse,
+ spawn_agent_fork_context_unknown: collabSpawnForkContextUnknown,
  runs_with_unclosed_spawn_agents: collabRunsWithUnclosedSpawnAgents,
  unclosed_spawn_agents: collabUnclosedSpawnAgents,
  runs_with_spawn_thread_limit_failures: collabRunsWithSpawnThreadLimitFailures,
@@ -357,9 +372,10 @@ export function formatDoctorUsageSummary(result) {
  const collabLifecycleUnknownSignal = collabLifecycleUnknownRuns > 0
    ? `, lifecycle_unknown_runs=${collabLifecycleUnknownRuns}`
    : '';
+ const collabForkContextSignal = `, fork_context=${result.collab.spawn_agent_fork_context_true}/${result.collab.spawn_agent_fork_context_false}/${result.collab.spawn_agent_fork_context_unknown}`;
  const collabToolList = formatTopList(result.collab.top_tools.map((entry) => ({ key: entry.tool, value: entry.calls })), 3, 'tools');
  lines.push(` - collab: ${result.collab.runs_with_tool_calls} (${formatPercent(result.collab.runs_with_tool_calls, result.runs.total)})${collabSuffix}`
-   + `${collabTaskSuffix}, events=${result.collab.total_tool_calls}${collabAvg} (ok=${collabOk}, failed=${collabFailed}${collabLeakSignal}${collabThreadLimitSignal}${collabLifecycleUnknownSignal})${collabToolList}`);
+   + `${collabTaskSuffix}, events=${result.collab.total_tool_calls}${collabAvg} (ok=${collabOk}, failed=${collabFailed}${collabLeakSignal}${collabThreadLimitSignal}${collabLifecycleUnknownSignal}${collabForkContextSignal})${collabToolList}`);
  if (result.delegation.active_top_level_tasks > 0) {
    lines.push(` - delegation: ${result.delegation.active_with_subagents}/${result.delegation.active_top_level_tasks} top-level tasks have subagent manifests (${result.delegation.total_subagent_manifests} total); child_runs=${result.delegation.total_child_runs} over ${result.delegation.tasks_with_child_runs} tasks`);
  }
@@ -516,6 +516,7 @@ function parseCollabToolCallLine(line, stageId, commandIndex) {
  sender_thread_id: typeof item.sender_thread_id === 'string' ? item.sender_thread_id : 'unknown',
  receiver_thread_ids: receiverThreadIds,
  prompt: typeof item.prompt === 'string' ? item.prompt : null,
+ fork_context: typeof item.fork_context === 'boolean' ? item.fork_context : null,
  agents_states: item.agents_states && typeof item.agents_states === 'object'
  ? item.agents_states
  : null
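
The `fork_context` changes in the hunks above fit together as one small pipeline: the parser keeps the field only when it is a real boolean, the usage scan sorts `spawn_agent` entries into three buckets, and the summary prints them as `true/false/unknown`. The following is a minimal standalone sketch of that flow. The function names and the `tool` field on each entry are assumptions made for illustration, since the diff shows only fragments of the real code.

```javascript
// Illustrative sketch only: function names and the `tool` field are
// assumptions; the diff shows only fragments of the real code.

// Keep fork_context only when it is a real boolean; anything else becomes null.
function normalizeForkContext(item) {
  return typeof item?.fork_context === 'boolean' ? item.fork_context : null;
}

// Tally spawn_agent entries into true / false / unknown buckets.
function countForkContext(entries) {
  const counts = { true: 0, false: 0, unknown: 0 };
  for (const entry of entries) {
    if (entry?.tool !== 'spawn_agent') continue;
    const value = normalizeForkContext(entry);
    if (value === true) counts.true += 1;
    else if (value === false) counts.false += 1;
    else counts.unknown += 1;
  }
  return counts;
}

// Render the counters in the true/false/unknown order used by the summary line.
function formatForkContextSignal(counts) {
  return `, fork_context=${counts.true}/${counts.false}/${counts.unknown}`;
}

const sample = [
  { tool: 'spawn_agent', fork_context: true },
  { tool: 'spawn_agent', fork_context: false },
  { tool: 'spawn_agent', fork_context: 'yes' }, // non-boolean counts as unknown
  { tool: 'spawn_agent' },                      // missing counts as unknown
  { tool: 'wait', fork_context: true },         // ignored: not a spawn_agent call
];
console.log(formatForkContextSignal(countForkContext(sample)));
// prints ", fork_context=1/1/2"
```

The separate `unknown` bucket matters because upstream events may omit the field entirely; folding those into `false` would overstate how often forking was explicitly disabled.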
package/docs/README.md CHANGED
@@ -26,6 +26,8 @@ Codex Orchestrator is the coordination layer that glues together Codex-driven ag
  ## How It Works
  - **Planner → Builder → Tester → Reviewer:** The core `TaskManager` (see `orchestrator/src/manager.ts`) wires together agent interfaces that decide *what* to run (planner), execute the selected pipeline stage (builder), verify results (tester), and give a final decision (reviewer).
  - **Execution modes:** Each plan item can flag `requires_cloud` and task metadata can set `execution.parallel`; the mode policy picks `mcp` (local MCP runtime) or `cloud` execution accordingly. Cloud runs perform a quick preflight (env id, codex availability, optional remote branch) and fall back to `mcp` with both summary text and a structured `cloud_fallback` manifest block when preflight fails.
+ - **Runtime provider modes:** `runtimeMode=cli|appserver` is orthogonal to `executionMode`; local default runtime is `appserver` with `cli` break-glass support preserved. Explicit `executionMode=cloud + runtimeMode=appserver` remains unsupported and fails fast.
+ - **Advanced feature posture:** `js_repl` is enabled by default globally (local + cloud lanes). For deterministic cloud contracts, pin explicit feature lanes (`CODEX_CLOUD_ENABLE_FEATURES=js_repl` and separate `CODEX_CLOUD_DISABLE_FEATURES=js_repl` runs). Use `CODEX_CLOUD_DISABLE_FEATURES=js_repl` for task-scoped cloud break-glass; reserve `codex features disable js_repl` for global emergency toggles and re-enable with `codex features enable js_repl`; `memories` remains scoped to explicit eval lanes (legacy alias `memory_tool` is compatibility-only).
  - **Event-driven persistence:** Milestones emit typed events on `EventBus`. `PersistenceCoordinator` captures run summaries in the task state store and writes manifests so nothing is lost if the process crashes.
  - **CLI lifecycle:** `CodexOrchestrator` (in `orchestrator/src/cli/orchestrator.ts`) resolves instruction sources (`AGENTS.md`, `docs/AGENTS.md`, `.agent/AGENTS.md`), loads the chosen pipeline, executes each command stage via `runCommandStage`, and keeps heartbeats plus command status current inside the manifest (approval evidence will surface once prompt wiring lands).
  - **Control-plane & scheduler integrations:** Optional validation (`control-plane/`) and scheduling (`scheduler/`) modules enrich manifests with drift checks, plan assignments, and remote run metadata.
@@ -102,6 +104,7 @@ Use `npx @kbediako/codex-orchestrator resume --run <run-id>` to continue interru
  - `codex-orchestrator mcp serve [--repo <path>] [--dry-run] [-- <extra args>]`: launch the MCP stdio server (delegates to `codex mcp-server`; stdout guard keeps protocol-only output, logs to stderr).
  - `codex-orchestrator init codex [--cwd <path>] [--force]`: copy starter templates into a repo (includes `mcp-client.json`, `AGENTS.md`, downstream .codex/config.toml + .codex/agents/* role files sourced from `templates/codex/.codex/*`, and `codex.orchestrator.json`; no overwrite unless `--force`).
  - `codex-orchestrator setup [--yes] [--refresh-skills]`: one-shot bootstrap for downstream users (installs bundled skills, configures delegation + DevTools wiring, and prints policy/usage guidance). By default, setup does not overwrite existing skills; add `--refresh-skills` when you want to replace existing bundled skill files.
+ - Canonical bundled skill roster lives in `README.md` ("Bundled skills" section) and shipped files under `skills/`.
  - `codex-orchestrator start [pipeline] [--auto-issue-log] [--repo-config-required]`: starts a pipeline run. `--auto-issue-log` writes failure bundles automatically (including setup failures before manifest creation); `--repo-config-required` disables packaged config fallback.
  - `codex-orchestrator flow [--task <task-id>] [--auto-issue-log] [--repo-config-required]`: runs `docs-review` then `implementation-gate` in sequence; stops on the first failure. `--auto-issue-log` writes failure bundles automatically (including setup failures before manifest creation); `--repo-config-required` disables packaged config fallback.
  - `codex-orchestrator doctor [--format json] [--usage] [--cloud-preflight] [--issue-log] [--apply]`: check optional tooling dependencies plus collab/cloud/delegation readiness and print enablement commands. `--usage` appends a local usage snapshot (scans `.runs/`) with adoption KPIs. `--issue-log` appends/creates `docs/codex-orchestrator-issues.md` (or `--issue-log-path`) and writes a JSON bundle under `out/<resolved-task>/doctor/issue-bundles/` with doctor context plus latest run context when available. `--apply` plans/applies quick fixes (use with `--yes`).
@@ -113,7 +116,7 @@ Use `npx @kbediako/codex-orchestrator resume --run <run-id>` to continue interru
 
  ## Publishing (npm)
  - Pack audit: `npm run pack:audit` (validates the tarball file list; run `npm run clean:dist && npm run build` first if `dist/` contains non-runtime artifacts).
- - Pack smoke: `npm run pack:smoke` (installs the tarball in a temp mock repo, runs CLI behavior checks including `review` artifacts and `long-poll-wait` skill install, and validates delegate-server JSONL; uses network).
+ - Pack smoke: `npm run pack:smoke` (installs the tarball in a temp mock repo, runs CLI behavior checks including `review` artifacts and `long-poll-wait` skill install, and validates delegate-server JSONL; uses network). Treat this as a spot-check gate; use `npm run pack:audit` for full tarball inventory validation.
  - Release tags: `vX.Y.Z` or `vX.Y.Z-alpha.N` must match `package.json` version.
  - Dist-tags: stable publishes to `latest`; alpha publishes to `alpha` and uses a GitHub prerelease.
  - Publishing auth: workflow attempts OIDC trusted publishing first (`id-token: write` + `--provenance`), then falls back to `secrets.NPM_TOKEN` when OIDC is unavailable. `secrets.NPM_TOKEN` must be an npm automation token (not a token that requires OTP).
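
The release-tag and dist-tag rules in the Publishing list above can be checked mechanically. The following is a hedged sketch of such a check, not the workflow's actual script: `tagMatchesVersion`, `distTagFor`, and the exact regular expression are assumptions made for illustration. Only the `vX.Y.Z` / `vX.Y.Z-alpha.N` tag shapes and the `latest` / `alpha` dist-tags come from the text.

```javascript
// Illustrative sketch only: helper names and the pattern are hypothetical,
// not taken from the package's release workflow.
const TAG_PATTERN = /^v(\d+\.\d+\.\d+(?:-alpha\.\d+)?)$/;

// True when the tag has an accepted shape and equals "v" + package.json version.
function tagMatchesVersion(tag, packageVersion) {
  const match = TAG_PATTERN.exec(tag);
  return match !== null && match[1] === packageVersion;
}

// Stable versions publish to "latest"; alpha versions publish to "alpha".
function distTagFor(packageVersion) {
  return /-alpha\.\d+$/.test(packageVersion) ? 'alpha' : 'latest';
}

console.log(tagMatchesVersion('v0.1.38', '0.1.38')); // prints true
console.log(distTagFor('0.1.38-alpha.2'));           // prints "alpha"
```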
@@ -192,7 +195,7 @@ Notes:
  - `TaskStateStore` writes per-task snapshots with bounded lock retries; failures degrade gracefully while still writing the main manifest.
  - `RunManifestWriter` generates the canonical manifest JSON for each run (mirrored under `.runs/`), while metrics appenders and summary writers keep `out/` up to date.
  - `run-summary.json` now carries `usageKpi` run-level signals (cloud/collab/delegation/rlm indicators) and `cloudFallback` details when a cloud request is downgraded to MCP.
- - `collab_tool_calls` in the manifest captures collab tool call JSONL lines extracted from command stdout (bounded by `CODEX_ORCHESTRATOR_COLLAB_MAX_EVENTS`, default 200; set 0 to disable capture). For `spawn_agent` calls, keep prompt-role intent explicit (first-line `[agent_type:<role>]`) and set `agent_type` when supported so routing remains auditable even when event payloads omit `agent_type`.
+ - `collab_tool_calls` in the manifest captures collab tool call JSONL lines extracted from command stdout (bounded by `CODEX_ORCHESTRATOR_COLLAB_MAX_EVENTS`, default 200; set 0 to disable capture). For `spawn_agent` calls, keep prompt-role intent explicit (first-line `[agent_type:<role>]`) and set `agent_type` when supported so routing remains auditable even when event payloads omit `agent_type`; keep `fork_context` disabled by default and enable it only for streams that require inherited thread history. When emitted upstream, `spawn_agent.fork_context` is persisted and summarized by `codex-orchestrator doctor --usage` counters (`true/false/unknown`) to support evidence-based policy decisions.
  - Heartbeat files and timestamps guard against stalled runs. `orchestrator/src/cli/metrics/metricsRecorder.ts` aggregates command durations, exit codes, and guardrail stats for later review.
  - Optional caps: `CODEX_ORCHESTRATOR_EXEC_EVENT_MAX_CHUNKS` limits captured exec chunk events per command (defaults to 500; set 0 for no cap), `CODEX_ORCHESTRATOR_TELEMETRY_MAX_EVENTS` caps in-memory telemetry events queued before flush (defaults to 1000; set 0 for no cap), and `CODEX_METRICS_PRIVACY_EVENTS_MAX` limits privacy decision events stored in `metrics.json` (-1 = no cap; `privacy_event_count` still reflects total).
 
@@ -212,11 +215,11 @@ Note: the commands below assume a source checkout; `scripts/` helpers are not in
  | `npm run eval:test` | Optional evaluation harness (enable when `evaluation/fixtures/**` is populated). |
  | `npm run docs:check` | Deterministically validates scripts/pipelines/paths referenced in agent-facing docs. |
  | `npm run docs:freshness` | Validates docs registry coverage + review recency; writes `out/<task-id>/docs-freshness.json`. |
- | `npm run ci:cloud-canary` | Runs the cloud canary harness (`scripts/cloud-canary-ci.mjs`) to verify cloud lifecycle manifest + run-summary evidence; credential-gated by `CODEX_CLOUD_ENV_ID` and optional auth secrets (`CODEX_CLOUD_BRANCH` defaults to `main`). Feature flags can be passed through with `CODEX_CLOUD_ENABLE_FEATURES` / `CODEX_CLOUD_DISABLE_FEATURES` (comma- or space-delimited, e.g. `sqlite,memory_tool`). |
+ | `npm run ci:cloud-canary` | Runs the cloud canary harness (`scripts/cloud-canary-ci.mjs`) to verify cloud lifecycle manifest + run-summary evidence; credential-gated by `CODEX_CLOUD_ENV_ID` and optional auth secrets (`CODEX_CLOUD_BRANCH` defaults to `main`). Feature flags can be passed through with `CODEX_CLOUD_ENABLE_FEATURES` / `CODEX_CLOUD_DISABLE_FEATURES` (comma- or space-delimited, e.g. `sqlite,memories`). |
  | `node scripts/delegation-guard.mjs` | Enforces subagent delegation evidence before review (repo-only). |
  | `node scripts/spec-guard.mjs --dry-run` | Validates spec freshness; required before review (repo-only). |
  | `node scripts/diff-budget.mjs` | Guards against oversized diffs before review (repo-only; defaults: 25 files / 800 lines; supports explicit overrides). |
- | `npm run pack:smoke` | Downstream simulation gate for npm consumers (tarball install in temp mock repo, `review` wrapper artifacts, delegate-server JSONL, and `skills install --only long-poll-wait`). Core lane runs it automatically when downstream-facing paths change, and `.github/workflows/pack-smoke-backstop.yml` runs a weekly `main` backstop. |
+ | `npm run pack:smoke` | Downstream simulation gate for npm consumers (tarball install in temp mock repo, `review` wrapper artifacts, delegate-server JSONL, and `skills install --only long-poll-wait`). Spot-check gate; pair with `npm run pack:audit` when you need full tarball inventory coverage. Core lane runs it automatically when downstream-facing paths change, and `.github/workflows/pack-smoke-backstop.yml` runs a weekly `main` backstop. |
  | `codex-orchestrator review` | Runs the standalone review wrapper with task-scoped manifest evidence; delegation MCP is enabled by default (explicit disable available via `CODEX_REVIEW_DISABLE_DELEGATION_MCP=1` / `--disable-delegation-mcp`), runtime guards are opt-in via `CODEX_REVIEW_*` env vars, and patience-first checkpoints log by default (`CODEX_REVIEW_MONITOR_INTERVAL_SECONDS` tunes/disables). Large uncommitted scopes get an automatic prompt advisory (`CODEX_REVIEW_LARGE_SCOPE_FILE_THRESHOLD` / `CODEX_REVIEW_LARGE_SCOPE_LINE_THRESHOLD`). Optional auto failure issue logging via `CODEX_REVIEW_AUTO_ISSUE_LOG=1` or `--auto-issue-log`. |
  | `npm run review` | Runs `codex review` with task-scoped manifest evidence; delegation MCP is enabled by default (explicit disable available via `CODEX_REVIEW_DISABLE_DELEGATION_MCP=1` / `--disable-delegation-mcp`), runtime guards are opt-in via `CODEX_REVIEW_*` env vars, and patience-first checkpoints log by default (`CODEX_REVIEW_MONITOR_INTERVAL_SECONDS` tunes/disables). Large uncommitted scopes get an automatic prompt advisory (`CODEX_REVIEW_LARGE_SCOPE_FILE_THRESHOLD` / `CODEX_REVIEW_LARGE_SCOPE_LINE_THRESHOLD`). Optional auto failure issue logging via `CODEX_REVIEW_AUTO_ISSUE_LOG=1` or `--auto-issue-log`. |
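
The `ci:cloud-canary` row in the table above describes `CODEX_CLOUD_ENABLE_FEATURES` / `CODEX_CLOUD_DISABLE_FEATURES` as comma- or space-delimited lists. A parser for that shape might look like the sketch below. `parseFeatureList` is a hypothetical helper, and dropping duplicates is an assumption of this sketch; the real harness in `scripts/cloud-canary-ci.mjs` may behave differently.

```javascript
// Illustrative sketch only: parseFeatureList is hypothetical and is not
// taken from scripts/cloud-canary-ci.mjs.
function parseFeatureList(raw) {
  const names = String(raw ?? '')
    .split(/[\s,]+/)                    // accept commas, spaces, or a mix of both
    .filter((name) => name.length > 0); // drop empties from leading/trailing separators
  return [...new Set(names)];           // de-duplicate, keeping first-seen order
}

console.log(parseFeatureList('sqlite,memories')); // prints [ 'sqlite', 'memories' ]
console.log(parseFeatureList(' js_repl  sqlite ')); // prints [ 'js_repl', 'sqlite' ]
```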
 
package/package.json CHANGED
@@ -1,10 +1,10 @@
  {
    "name": "@kbediako/codex-orchestrator",
-   "version": "0.1.37",
+   "version": "0.1.38",
    "license": "MIT",
    "repository": {
      "type": "git",
-     "url": "https://github.com/Kbediako/CO"
+     "url": "git+https://github.com/Kbediako/CO.git"
    },
    "homepage": "https://github.com/Kbediako/CO#readme",
    "bugs": {
@@ -51,6 +51,7 @@
  "docs:freshness": "node scripts/docs-freshness.mjs --check",
  "docs:sync": "node --loader ts-node/esm scripts/docs-hygiene.ts --sync",
  "ci:cloud-canary": "node scripts/cloud-canary-ci.mjs",
+ "canary:js-repl-usage": "node scripts/js-repl-usage-matrix.mjs",
  "canary:runtime": "node scripts/runtime-mode-canary.mjs",
  "prelint": "node scripts/build-patterns-if-needed.mjs",
  "lint": "eslint orchestrator/src orchestrator/tests packages/orchestrator/src packages/orchestrator/tests packages/shared adapters evaluation/harness evaluation/tests --ext .ts,.tsx",
@@ -126,5 +127,13 @@
      "ink": "^4.4.1",
      "js-yaml": "^4.1.0",
      "react": "^18.3.1"
-   }
+   },
+   "description": "![Setup demo](docs/assets/setup.gif)",
+   "main": "dist/orchestrator/src/cli/orchestrator.js",
+   "directories": {
+     "doc": "docs",
+     "test": "tests"
+   },
+   "keywords": [],
+   "author": ""
  }
@@ -880,6 +880,7 @@
    "items": { "type": "string", "minLength": 1 }
  },
  "prompt": { "type": ["string", "null"] },
+ "fork_context": { "type": ["boolean", "null"] },
  "agents_states": {
    "type": ["object", "null"],
    "additionalProperties": true
@@ -36,5 +36,5 @@ Use this skill when you need browser-grounded evidence (UI screenshots, console
 
  - `standalone-review`: route ad-hoc review checks through a manifest-backed review loop when findings need auditability.
  - `collab-subagents-first`: isolate heavy browser exploration in a dedicated subagent stream to protect parent context.
- - `frontend-design-review`: use when the task emphasis is structured UI/UX critique with evidence-backed recommendations.
+ - `frontend-design-review`: optional global skill (not bundled in CO release); use when the task emphasis is structured UI/UX critique with evidence-backed recommendations.
  - `long-poll-wait`: monitor long-running browser-driven checks or CI replay loops to terminal state.
@@ -0,0 +1,83 @@
+ ---
+ name: codex-orchestrator
+ description: "Primary entrypoint for Codex Orchestrator usage: route tasks to the right pipeline, mode, and supporting skills with minimal, auditable steps."
+ ---
+
+ # Codex-Orchestrator Workflow Router
+
+ ## Overview
+
+ Use this skill as the default entrypoint for work in CO or downstream repos using `@kbediako/codex-orchestrator`. It routes intent to the smallest correct command path and points to specialized skills only when needed.
+
+ ## Core Contract
+
+ - Keep MCP as the control plane by default.
+ - Use docs-first before implementation edits.
+ - Use delegation early for non-trivial work with bounded stream ownership.
+ - Keep runtime and execution modes explicit and orthogonal:
+   - `runtimeMode=cli|appserver`
+   - `executionMode=mcp|cloud`
+
+ ## Default Command Path
+
+ For most task-scoped work:
+ - `codex-orchestrator flow --task <task-id>`
+ - `codex-orchestrator doctor --usage --window-days 30 --task <task-id>`
+ - `codex-orchestrator review --task <task-id>`
+
+ For explicit stage control:
+ - `codex-orchestrator start docs-review --task <task-id> --format json`
+ - `codex-orchestrator start implementation-gate --task <task-id> --format json`
+ - `codex-orchestrator status --run <run-id> --watch --interval 10`
+
+ ## Intent Router
+
+ 1) Task/spec scaffolding and mirror sync:
+ - Use `docs-first`.
+
+ 2) Delegation setup/run-control and subagent evidence discipline:
+ - Use `delegation-usage`.
+
+ 3) Stream decomposition across independent bounded work:
+ - Use `collab-subagents-first`.
+
+ 4) Option analysis, tradeoffs, and decision framing before implementation:
+ - Use `collab-deliberation`.
+
+ 5) Long-running checks/reviews/cloud jobs that need patience-first monitoring:
+ - Use `long-poll-wait`.
+
+ 6) Implementation checkpoint reviews and final handoff:
+ - Use `standalone-review`, then `elegance-review`.
+
+ 7) Release/tag/publish workflows:
+ - Use `release`.
+
+ 8) Collab/multi-agent scenario testing and evidence capture:
+ - Use `collab-evals`.
+
+ ## Feature Posture
+
+ - `js_repl` is default-on globally (local + cloud lanes); use explicit cloud feature lanes for deterministic contracts.
+ - `memories` stays scoped to explicit eval lanes.
+ - Subagent context forking (`fork_context`) is guidance-first: keep it `false` for bounded streams, and set `true` only when the child must inherit prior thread history.
+ - Compatibility note: upstream still accepts the legacy alias `memory_tool`; use `memories` in new CO guidance unless documenting legacy compatibility behavior.
+
+ ## Related Docs
+
+ - `AGENTS.md`
+ - `docs/AGENTS.md`
+ - `README.md`
+ - `docs/README.md`
+
+ ## Related Skills
+
+ - `docs-first`
+ - `delegation-usage`
+ - `collab-subagents-first`
+ - `collab-deliberation`
+ - `standalone-review`
+ - `elegance-review`
+ - `long-poll-wait`
+ - `release`
+ - `agent-first-adoption-steering`
@@ -102,6 +102,7 @@ Skip subagents when all conditions are true:
  - `message` (plain text), or
  - `items` (structured input).
  - Do not send both `message` and `items` in one spawn call.
+ - Keep `fork_context` disabled by default to preserve bounded context. Enable `fork_context=true` only when the subagent needs prior thread history that would otherwise be costly/risky to restate.
  - `spawn_agent` falls back to `default` when `agent_type` is omitted; always set `agent_type` explicitly.
  - Prefix spawned prompts with `[agent_type:<role>]` on line one so role intent is auditable from collab JSONL/manifests.
  - Use `items` when you need explicit structured context (for example `mention` paths like `app://...` or selected `skill` entries) instead of flattening everything into one long string.
@@ -26,6 +26,7 @@ Multi-agent (collab tools) mode is separate from delegation. For symbolic RLM su
  - Do not send both `message` and `items` in the same `spawn_agent` call.
  - `spawn_agent` falls back to `default` when `agent_type` is omitted; always set `agent_type` explicitly.
  - For auditable role routing, prefix spawned prompts with `[agent_type:<role>]` on the first line and keep it aligned with `agent_type`.
+ - Keep `fork_context` disabled by default for bounded streams; use `fork_context=true` only when the child must inherit prior thread context.
  - Spawn returns an `agent_id` (thread id). Current TUI collab rendering is id-based; do not depend on custom visible agent names.
  - Subagents spawned through collab run with approval effectively set to `never`; design child tasks to avoid approval/escalation requirements.
  - Collab spawn depth is bounded. Near/at max depth, recursive delegation can fail or collab can be disabled in children; prefer shallow parent fan-out.
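
The role-routing guidance in this hunk (a first-line `[agent_type:<role>]` prefix kept aligned with `agent_type`) lends itself to a simple consistency check. The sketch below is one way such a check could look. `parseRolePrefix`, `checkSpawnRole`, and the returned shape are hypothetical names chosen for illustration, and the real lifecycle validation in the package may apply different rules.

```javascript
// Illustrative sketch only: both helpers are hypothetical and do not
// reproduce the package's lifecycle validation.

// Extract <role> from a first-line "[agent_type:<role>]" prefix, or null.
function parseRolePrefix(prompt) {
  const firstLine = String(prompt ?? '').split('\n', 1)[0].trim();
  const match = /^\[agent_type:([^\]\s]+)\]/.exec(firstLine);
  return match ? match[1] : null;
}

// Require prompt-role evidence; compare against agent_type only when present.
function checkSpawnRole({ prompt, agent_type } = {}) {
  const promptRole = parseRolePrefix(prompt);
  if (promptRole === null) {
    return { ok: false, reason: 'missing [agent_type:<role>] prefix on line one' };
  }
  if (typeof agent_type === 'string' && agent_type !== promptRole) {
    return { ok: false, reason: `agent_type "${agent_type}" does not match prompt role "${promptRole}"` };
  }
  return { ok: true, role: promptRole };
}

console.log(checkSpawnRole({ prompt: '[agent_type:explorer]\nMap the repo layout.', agent_type: 'explorer' }));
// prints { ok: true, role: 'explorer' }
```

Treating `agent_type` as optional reflects the note elsewhere in this diff that event payloads can omit it, which is why the prompt prefix carries the auditable evidence.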
@@ -1,2 +1,3 @@
  model = "gpt-5.3-codex-spark"
  model_reasoning_effort = "xhigh"
+ model_reasoning_summary = "none"