npm - @kbediako/codex-orchestrator - Versions diffs - 0.1.37 → 0.1.38 - Mend

@kbediako/codex-orchestrator 0.1.37 → 0.1.38

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11)

package/README.md +15 -2
package/dist/orchestrator/src/cli/doctorUsage.js +17 -1
package/dist/orchestrator/src/cli/services/commandRunner.js +1 -0
package/docs/README.md +7 -4
package/package.json +12 -3
package/schemas/manifest.json +1 -0
package/skills/chrome-devtools/SKILL.md +1 -1
package/skills/codex-orchestrator/SKILL.md +83 -0
package/skills/collab-subagents-first/SKILL.md +1 -0
package/skills/delegation-usage/SKILL.md +1 -0
package/templates/codex/.codex/agents/explorer-fast.toml +1 -0

package/README.md CHANGED

@@ -39,6 +39,16 @@ Node.js >= 20 is required.
    > Tip: if you prefer `npx`, replace `codex-orch` with `npx @kbediako/codex-orchestrator`.
    > Tip: for multiple commands, you can also `export MCP_RUNNER_TASK_ID=<task-id>` once.
+## Runtime + Execution Modes
+- Mode semantics are orthogonal:
+  - `executionMode=mcp|cloud` controls where stages execute.
+  - `runtimeMode=cli|appserver` controls local runtime provider selection.
+- Local default runtime is `appserver`; preserve `--runtime-mode cli` as break-glass.
+- `--execution-mode cloud --runtime-mode appserver` is intentionally unsupported and fails fast with actionable errors.
+- `js_repl` is enabled by default globally. For deterministic cloud contracts, run explicit feature lanes (`CODEX_CLOUD_ENABLE_FEATURES=js_repl` and separate `CODEX_CLOUD_DISABLE_FEATURES=js_repl` runs). Use `CODEX_CLOUD_DISABLE_FEATURES=js_repl` for task-scoped cloud break-glass; reserve `codex features disable js_repl` for global emergency toggles and re-enable with `codex features enable js_repl`.
+- `memories` remains scoped to explicit eval lanes (legacy alias `memory_tool` is compatibility-only).
 ## Downstream init (recommended)
 Use this when you want Codex to drive work inside another repo with the CO defaults.
@@ -97,6 +107,7 @@ codex -c 'mcp_servers.delegation.enabled=true' ...
 Codex built-ins are `default`, `explorer`, `worker`, and `awaiter`. `researcher` is user-defined.
 - `spawn_agent` defaults to `default` when `agent_type` is omitted, so always set `agent_type` explicitly.
 - Multi-turn loops are supported (`spawn_agent` -> `send_input` -> `wait`/`resume_agent` -> `close_agent`), so subagents can iterate before parent synthesis.
+- Keep `fork_context` off by default for bounded subagent streams; set `fork_context=true` only when the subagent must inherit prior thread history.
 In Codex CLI `0.105.0`, built-in `explorer` no longer pins an older model profile; it inherits top-level defaults unless you attach a role `config_file`.
 CO now ships this downstream starter config via `init codex` (source template: `templates/codex/.codex/config.toml`; installed as .codex/config.toml in target repos):
@@ -155,7 +166,7 @@ Delegation guard profile:
 RLM (Recursive Language Model) is the long-horizon loop used by the `rlm` pipeline (`codex-orchestrator rlm "<goal>"` or `codex-orchestrator start rlm --goal "<goal>"`). Delegated runs only enter RLM when the child is launched with the `rlm` pipeline (or the rlm runner directly). In auto mode it resolves to symbolic only when context is large (`RLM_SYMBOLIC_MIN_BYTES`) and an explicit context signal is present (`RLM_CONTEXT_PATH` or delegated run); otherwise it stays iterative. The runner writes state to `.runs/<task-id>/cli/<run-id>/rlm/state.json` and stops when the validator passes or budgets are exhausted.
 For symbolic mode, the Option 2 alignment checker is enabled by default (`RLM_ALIGNMENT_CHECKER=1`) and writes append-only alignment artifacts under `.runs/<task-id>/cli/<run-id>/rlm/alignment/` (ledger + projection). Rollback toggle: set `RLM_ALIGNMENT_CHECKER=0`. Enforcement is opt-in via `RLM_ALIGNMENT_CHECKER_ENFORCE=1`.
-Symbolic subcalls can optionally use collab tools. Fast path: `codex-orchestrator rlm --multi-agent auto "<goal>"` (legacy alias: `--collab auto`; sets `RLM_SYMBOLIC_MULTI_AGENT=1` plus legacy `RLM_SYMBOLIC_COLLAB=1` for compatibility, and implies symbolic mode). Collab requires `multi_agent=true` in `codex features list` (`collab` remains a legacy alias). Collab tool calls parsed from `codex exec --json --enable multi_agent` are stored in `manifest.collab_tool_calls` (bounded by `CODEX_ORCHESTRATOR_COLLAB_MAX_EVENTS`, set to `0` to disable). For auditable role routing, prefix spawned prompts with `[agent_type:<role>]` and set `spawn_agent.agent_type` when supported; lifecycle validation enforces prompt-role evidence and validates `agent_type` when present (`RLM_SYMBOLIC_MULTI_AGENT_ROLE_POLICY=warn|off`, legacy alias `RLM_COLLAB_ROLE_POLICY`; `RLM_SYMBOLIC_MULTI_AGENT_ALLOW_DEFAULT_ROLE=1`, legacy alias `RLM_COLLAB_ALLOW_DEFAULT_ROLE`). `codex-orchestrator codex setup` remains available when you want a managed/pinned CLI path (opt-in via `CODEX_CLI_USE_MANAGED=1`).
+Symbolic subcalls can optionally use collab tools. Fast path: `codex-orchestrator rlm --multi-agent auto "<goal>"` (legacy alias: `--collab auto`; sets `RLM_SYMBOLIC_MULTI_AGENT=1` plus legacy `RLM_SYMBOLIC_COLLAB=1` for compatibility, and implies symbolic mode). Collab requires `multi_agent=true` in `codex features list` (`collab` remains a legacy alias). Collab tool calls parsed from `codex exec --json --enable multi_agent` are stored in `manifest.collab_tool_calls` (bounded by `CODEX_ORCHESTRATOR_COLLAB_MAX_EVENTS`, set to `0` to disable); when present in events, `spawn_agent.fork_context` is captured for observability and surfaced in `codex-orchestrator doctor --usage` fork-context counters. For auditable role routing, prefix spawned prompts with `[agent_type:<role>]` and set `spawn_agent.agent_type` when supported; lifecycle validation enforces prompt-role evidence and validates `agent_type` when present (`RLM_SYMBOLIC_MULTI_AGENT_ROLE_POLICY=warn|off`, legacy alias `RLM_COLLAB_ROLE_POLICY`; `RLM_SYMBOLIC_MULTI_AGENT_ALLOW_DEFAULT_ROLE=1`, legacy alias `RLM_COLLAB_ALLOW_DEFAULT_ROLE`). `codex-orchestrator codex setup` remains available when you want a managed/pinned CLI path (opt-in via `CODEX_CLI_USE_MANAGED=1`).
 For batch fan-out jobs, prefer native `spawn_agents_on_csv` before building custom orchestration wrappers.
 ### Delegation flow
@@ -215,10 +226,12 @@ Options:
 - `--codex-home <path>` targets a different Codex home directory.
 Bundled skills (may vary by release):
+- `codex-orchestrator`
 - `collab-subagents-first`
 - `chrome-devtools`
 - `delegation-usage`
 - `standalone-review`
+- `elegance-review`
 - `docs-first`
 - `collab-evals`
 - `collab-deliberation`
@@ -276,7 +289,7 @@ codex-orchestrator doctor --cloud-preflight
 - Active PR watch-resolve-merge loop: `codex-orchestrator pr resolve-merge --pr <number> --quiet-minutes <window>` (add `--auto-merge` when approved; exits early when author action is required).
 - Passive PR monitor loop: `codex-orchestrator pr watch-merge --pr <number> --quiet-minutes <window>` (monitor-only behavior; keeps waiting unless terminal/timeout).
 - Review checkpoints (npm-only safe): `NOTES="Goal: ... | Summary: ... | Risks: ..." codex-orchestrator review --task <task-id>` for manifest-backed standalone review wrapper behavior (auto-skips repo-only diff-budget script when unavailable in downstream installs); use `codex review "<focus>"` for quick prompt-only checks; use `codex-orchestrator start implementation-gate --task <task-id> --format json` when you want a full gate run.
-- Downstream simulation before shipping wrapper/skill changes: `npm run pack:smoke` (packaged CLI in temp mock repo; validates `review` artifacts and `long-poll-wait` install path).
+- Downstream simulation before shipping wrapper/skill changes: `npm run pack:smoke` (packaged CLI in temp mock repo; validates `review` artifacts and `long-poll-wait` install path; spot-check gate). Use `npm run pack:audit` for full tarball inventory validation.
 - Delegation: `codex-orchestrator doctor --apply --yes`, then enable for a Codex run with: `codex -c 'mcp_servers.delegation.enabled=true' ...`
 - Collab (symbolic RLM subagents): `codex-orchestrator rlm --multi-agent auto "<goal>"` (legacy alias: `--collab auto`; requires Codex `features.multi_agent=true`)
 - Cloud: set `CODEX_CLOUD_ENV_ID` (and optional `CODEX_CLOUD_BRANCH`), then run: `codex-orchestrator start <pipeline> --cloud --target <stage-id>`
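
The "Runtime + Execution Modes" section added above treats `executionMode` and `runtimeMode` as independent settings with one rejected combination. A minimal sketch of that rule follows. The helper name `resolveModes`, the default `executionMode` of `mcp`, and the error wording are assumptions made for illustration; they are not taken from the package source.

```javascript
// Hypothetical helper illustrating the documented mode rules.
// Only the allowed values and the rejected pair come from the README text.
const EXECUTION_MODES = ['mcp', 'cloud'];
const RUNTIME_MODES = ['cli', 'appserver'];

function resolveModes({ executionMode = 'mcp', runtimeMode = 'appserver' } = {}) {
  // Assumption: 'mcp' as the default execution mode; the README only
  // states that the local default runtime is 'appserver'.
  if (!EXECUTION_MODES.includes(executionMode)) {
    throw new Error(`Unknown executionMode: ${executionMode}`);
  }
  if (!RUNTIME_MODES.includes(runtimeMode)) {
    throw new Error(`Unknown runtimeMode: ${runtimeMode}`);
  }
  // The README documents this pair as intentionally unsupported (fail fast).
  if (executionMode === 'cloud' && runtimeMode === 'appserver') {
    throw new Error(
      'executionMode=cloud with runtimeMode=appserver is unsupported; ' +
      'consider --runtime-mode cli for cloud runs'
    );
  }
  return { executionMode, runtimeMode };
}
```

Calling `resolveModes({ executionMode: 'cloud', runtimeMode: 'cli' })` returns the pair unchanged, while the cloud plus appserver pair throws before any stage would run.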

package/dist/orchestrator/src/cli/doctorUsage.js CHANGED

@@ -31,6 +31,9 @@ export async function runDoctorUsage(options = {}) {
     const collabByEventType = {};
     const collabTools = new Map();
     const collabCaptureDisabled = String(process.env.CODEX_ORCHESTRATOR_COLLAB_MAX_EVENTS ?? '').trim() === '0';
+    let collabSpawnForkContextTrue = 0;
+    let collabSpawnForkContextFalse = 0;
+    let collabSpawnForkContextUnknown = 0;
     let collabRunsWithUnclosedSpawnAgents = 0;
     let collabUnclosedSpawnAgents = 0;
     let collabRunsWithSpawnThreadLimitFailures = 0;
@@ -160,6 +163,15 @@ export async function runDoctorUsage(options = {}) {
                     continue;
                 }
                 if (tool === 'spawn_agent') {
+                    if (entry?.fork_context === true) {
+                        collabSpawnForkContextTrue += 1;
+                    }
+                    else if (entry?.fork_context === false) {
+                        collabSpawnForkContextFalse += 1;
+                    }
+                    else {
+                        collabSpawnForkContextUnknown += 1;
+                    }
                     if (isFailed) {
                         const rawFailedSpawnId = typeof entry?.item_id === 'string' ? entry.item_id.trim() : '';
                         const failedSpawnId = rawFailedSpawnId.length > 0 && rawFailedSpawnId !== 'unknown'
@@ -290,6 +302,9 @@ export async function runDoctorUsage(options = {}) {
             by_event_type: collabByEventType,
             top_tools: collabTopTools,
             capture_disabled: collabCaptureDisabled,
+            spawn_agent_fork_context_true: collabSpawnForkContextTrue,
+            spawn_agent_fork_context_false: collabSpawnForkContextFalse,
+            spawn_agent_fork_context_unknown: collabSpawnForkContextUnknown,
             runs_with_unclosed_spawn_agents: collabRunsWithUnclosedSpawnAgents,
             unclosed_spawn_agents: collabUnclosedSpawnAgents,
             runs_with_spawn_thread_limit_failures: collabRunsWithSpawnThreadLimitFailures,
@@ -357,9 +372,10 @@ export function formatDoctorUsageSummary(result) {
     const collabLifecycleUnknownSignal = collabLifecycleUnknownRuns > 0
         ? `, lifecycle_unknown_runs=${collabLifecycleUnknownRuns}`
         : '';
+    const collabForkContextSignal = `, fork_context=${result.collab.spawn_agent_fork_context_true}/${result.collab.spawn_agent_fork_context_false}/${result.collab.spawn_agent_fork_context_unknown}`;
     const collabToolList = formatTopList(result.collab.top_tools.map((entry) => ({ key: entry.tool, value: entry.calls })), 3, 'tools');
     lines.push(`  - collab: ${result.collab.runs_with_tool_calls} (${formatPercent(result.collab.runs_with_tool_calls, result.runs.total)})${collabSuffix}`
-        + `${collabTaskSuffix}, events=${result.collab.total_tool_calls}${collabAvg} (ok=${collabOk}, failed=${collabFailed}${collabLeakSignal}${collabThreadLimitSignal}${collabLifecycleUnknownSignal})${collabToolList}`);
+        + `${collabTaskSuffix}, events=${result.collab.total_tool_calls}${collabAvg} (ok=${collabOk}, failed=${collabFailed}${collabLeakSignal}${collabThreadLimitSignal}${collabLifecycleUnknownSignal}${collabForkContextSignal})${collabToolList}`);
     if (result.delegation.active_top_level_tasks > 0) {
         lines.push(`  - delegation: ${result.delegation.active_with_subagents}/${result.delegation.active_top_level_tasks} top-level tasks have subagent manifests (${result.delegation.total_subagent_manifests} total); child_runs=${result.delegation.total_child_runs} over ${result.delegation.tasks_with_child_runs} tasks`);
     }
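
The hunks above add three counters for `spawn_agent.fork_context` and append them to the collab summary line as `fork_context=<true>/<false>/<unknown>`. The standalone sketch below isolates that tallying logic so it can be read without the surrounding function. The function names and the `tool` field on each entry are assumptions for illustration; the three-way classification and the output format follow the diff.

```javascript
// Illustrative re-statement of the fork_context tally from the diff.
// Assumption: each captured entry exposes its tool name as `entry.tool`.
function countForkContext(entries) {
  const counts = { forkTrue: 0, forkFalse: 0, forkUnknown: 0 };
  for (const entry of entries) {
    if (entry?.tool !== 'spawn_agent') {
      continue; // only spawn_agent calls carry fork_context
    }
    if (entry.fork_context === true) {
      counts.forkTrue += 1;
    } else if (entry.fork_context === false) {
      counts.forkFalse += 1;
    } else {
      // null, undefined, or any non-boolean value counts as unknown
      counts.forkUnknown += 1;
    }
  }
  return counts;
}

// Mirrors the `fork_context=T/F/U` fragment added to the summary line.
function formatForkContextSignal(counts) {
  return `fork_context=${counts.forkTrue}/${counts.forkFalse}/${counts.forkUnknown}`;
}

const sample = [
  { tool: 'spawn_agent', fork_context: true },
  { tool: 'spawn_agent', fork_context: false },
  { tool: 'spawn_agent', fork_context: null },
  { tool: 'wait' },
  { tool: 'spawn_agent' },
];
console.log(formatForkContextSignal(countForkContext(sample)));
// prints "fork_context=1/1/2"
```

Strict `=== true` and `=== false` comparisons mean that older manifests without the field land in the unknown bucket rather than being miscounted as `false`.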

package/dist/orchestrator/src/cli/services/commandRunner.js CHANGED

@@ -516,6 +516,7 @@ function parseCollabToolCallLine(line, stageId, commandIndex) {
         sender_thread_id: typeof item.sender_thread_id === 'string' ? item.sender_thread_id : 'unknown',
         receiver_thread_ids: receiverThreadIds,
         prompt: typeof item.prompt === 'string' ? item.prompt : null,
+        fork_context: typeof item.fork_context === 'boolean' ? item.fork_context : null,
         agents_states: item.agents_states && typeof item.agents_states === 'object'
             ? item.agents_states
             : null
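
The one-line change above stores `fork_context` only when the upstream event carries a real boolean and otherwise records `null`, which matches the `["boolean", "null"]` type added to `schemas/manifest.json`. A small sketch of that normalization in isolation follows. The function name `normalizeForkContext` is hypothetical; the `typeof` check is the one shown in the diff.

```javascript
// Hypothetical wrapper around the typeof check from the diff.
function normalizeForkContext(item) {
  // Strings such as "true" or numbers such as 1 are deliberately not coerced.
  return typeof item?.fork_context === 'boolean' ? item.fork_context : null;
}

console.log(normalizeForkContext({ fork_context: true }));   // prints true
console.log(normalizeForkContext({ fork_context: 'true' })); // prints null
console.log(normalizeForkContext({}));                       // prints null
```

Refusing to coerce keeps the manifest value within the schema's two allowed types and lets `doctor --usage` report ambiguous events as unknown.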

package/docs/README.md CHANGED

@@ -26,6 +26,8 @@ Codex Orchestrator is the coordination layer that glues together Codex-driven ag
 ## How It Works
 - **Planner → Builder → Tester → Reviewer:** The core `TaskManager` (see `orchestrator/src/manager.ts`) wires together agent interfaces that decide *what* to run (planner), execute the selected pipeline stage (builder), verify results (tester), and give a final decision (reviewer).
 - **Execution modes:** Each plan item can flag `requires_cloud` and task metadata can set `execution.parallel`; the mode policy picks `mcp` (local MCP runtime) or `cloud` execution accordingly. Cloud runs perform a quick preflight (env id, codex availability, optional remote branch) and fall back to `mcp` with both summary text and a structured `cloud_fallback` manifest block when preflight fails.
+- **Runtime provider modes:** `runtimeMode=cli|appserver` is orthogonal to `executionMode`; local default runtime is `appserver` with `cli` break-glass support preserved. Explicit `executionMode=cloud + runtimeMode=appserver` remains unsupported and fails fast.
+- **Advanced feature posture:** `js_repl` is enabled by default globally (local + cloud lanes). For deterministic cloud contracts, pin explicit feature lanes (`CODEX_CLOUD_ENABLE_FEATURES=js_repl` and separate `CODEX_CLOUD_DISABLE_FEATURES=js_repl` runs). Use `CODEX_CLOUD_DISABLE_FEATURES=js_repl` for task-scoped cloud break-glass; reserve `codex features disable js_repl` for global emergency toggles and re-enable with `codex features enable js_repl`; `memories` remains scoped to explicit eval lanes (legacy alias `memory_tool` is compatibility-only).
 - **Event-driven persistence:** Milestones emit typed events on `EventBus`. `PersistenceCoordinator` captures run summaries in the task state store and writes manifests so nothing is lost if the process crashes.
 - **CLI lifecycle:** `CodexOrchestrator` (in `orchestrator/src/cli/orchestrator.ts`) resolves instruction sources (`AGENTS.md`, `docs/AGENTS.md`, `.agent/AGENTS.md`), loads the chosen pipeline, executes each command stage via `runCommandStage`, and keeps heartbeats plus command status current inside the manifest (approval evidence will surface once prompt wiring lands).
 - **Control-plane & scheduler integrations:** Optional validation (`control-plane/`) and scheduling (`scheduler/`) modules enrich manifests with drift checks, plan assignments, and remote run metadata.
@@ -102,6 +104,7 @@ Use `npx @kbediako/codex-orchestrator resume --run <run-id>` to continue interru
 - `codex-orchestrator mcp serve [--repo <path>] [--dry-run] [-- <extra args>]`: launch the MCP stdio server (delegates to `codex mcp-server`; stdout guard keeps protocol-only output, logs to stderr).
 - `codex-orchestrator init codex [--cwd <path>] [--force]`: copy starter templates into a repo (includes `mcp-client.json`, `AGENTS.md`, downstream .codex/config.toml + .codex/agents/* role files sourced from `templates/codex/.codex/*`, and `codex.orchestrator.json`; no overwrite unless `--force`).
 - `codex-orchestrator setup [--yes] [--refresh-skills]`: one-shot bootstrap for downstream users (installs bundled skills, configures delegation + DevTools wiring, and prints policy/usage guidance). By default, setup does not overwrite existing skills; add `--refresh-skills` when you want to replace existing bundled skill files.
+- Canonical bundled skill roster lives in `README.md` ("Bundled skills" section) and shipped files under `skills/`.
 - `codex-orchestrator start [pipeline] [--auto-issue-log] [--repo-config-required]`: starts a pipeline run. `--auto-issue-log` writes failure bundles automatically (including setup failures before manifest creation); `--repo-config-required` disables packaged config fallback.
 - `codex-orchestrator flow [--task <task-id>] [--auto-issue-log] [--repo-config-required]`: runs `docs-review` then `implementation-gate` in sequence; stops on the first failure. `--auto-issue-log` writes failure bundles automatically (including setup failures before manifest creation); `--repo-config-required` disables packaged config fallback.
 - `codex-orchestrator doctor [--format json] [--usage] [--cloud-preflight] [--issue-log] [--apply]`: check optional tooling dependencies plus collab/cloud/delegation readiness and print enablement commands. `--usage` appends a local usage snapshot (scans `.runs/`) with adoption KPIs. `--issue-log` appends/creates `docs/codex-orchestrator-issues.md` (or `--issue-log-path`) and writes a JSON bundle under `out/<resolved-task>/doctor/issue-bundles/` with doctor context plus latest run context when available. `--apply` plans/applies quick fixes (use with `--yes`).
@@ -113,7 +116,7 @@ Use `npx @kbediako/codex-orchestrator resume --run <run-id>` to continue interru
 ## Publishing (npm)
 - Pack audit: `npm run pack:audit` (validates the tarball file list; run `npm run clean:dist && npm run build` first if `dist/` contains non-runtime artifacts).
-- Pack smoke: `npm run pack:smoke` (installs the tarball in a temp mock repo, runs CLI behavior checks including `review` artifacts and `long-poll-wait` skill install, and validates delegate-server JSONL; uses network).
+- Pack smoke: `npm run pack:smoke` (installs the tarball in a temp mock repo, runs CLI behavior checks including `review` artifacts and `long-poll-wait` skill install, and validates delegate-server JSONL; uses network). Treat this as a spot-check gate; use `npm run pack:audit` for full tarball inventory validation.
 - Release tags: `vX.Y.Z` or `vX.Y.Z-alpha.N` must match `package.json` version.
 - Dist-tags: stable publishes to `latest`; alpha publishes to `alpha` and uses a GitHub prerelease.
 - Publishing auth: workflow attempts OIDC trusted publishing first (`id-token: write` + `--provenance`), then falls back to `secrets.NPM_TOKEN` when OIDC is unavailable. `secrets.NPM_TOKEN` must be an npm automation token (not a token that requires OTP).
@@ -192,7 +195,7 @@ Notes:
 - `TaskStateStore` writes per-task snapshots with bounded lock retries; failures degrade gracefully while still writing the main manifest.
 - `RunManifestWriter` generates the canonical manifest JSON for each run (mirrored under `.runs/`), while metrics appenders and summary writers keep `out/` up to date.
 - `run-summary.json` now carries `usageKpi` run-level signals (cloud/collab/delegation/rlm indicators) and `cloudFallback` details when a cloud request is downgraded to MCP.
-- `collab_tool_calls` in the manifest captures collab tool call JSONL lines extracted from command stdout (bounded by `CODEX_ORCHESTRATOR_COLLAB_MAX_EVENTS`, default 200; set 0 to disable capture). For `spawn_agent` calls, keep prompt-role intent explicit (first-line `[agent_type:<role>]`) and set `agent_type` when supported so routing remains auditable even when event payloads omit `agent_type`.
+- `collab_tool_calls` in the manifest captures collab tool call JSONL lines extracted from command stdout (bounded by `CODEX_ORCHESTRATOR_COLLAB_MAX_EVENTS`, default 200; set 0 to disable capture). For `spawn_agent` calls, keep prompt-role intent explicit (first-line `[agent_type:<role>]`) and set `agent_type` when supported so routing remains auditable even when event payloads omit `agent_type`; keep `fork_context` disabled by default and enable it only for streams that require inherited thread history. When emitted upstream, `spawn_agent.fork_context` is persisted and summarized by `codex-orchestrator doctor --usage` counters (`true/false/unknown`) to support evidence-based policy decisions.
 - Heartbeat files and timestamps guard against stalled runs. `orchestrator/src/cli/metrics/metricsRecorder.ts` aggregates command durations, exit codes, and guardrail stats for later review.
 - Optional caps: `CODEX_ORCHESTRATOR_EXEC_EVENT_MAX_CHUNKS` limits captured exec chunk events per command (defaults to 500; set 0 for no cap), `CODEX_ORCHESTRATOR_TELEMETRY_MAX_EVENTS` caps in-memory telemetry events queued before flush (defaults to 1000; set 0 for no cap), and `CODEX_METRICS_PRIVACY_EVENTS_MAX` limits privacy decision events stored in `metrics.json` (-1 = no cap; `privacy_event_count` still reflects total).
@@ -212,11 +215,11 @@ Note: the commands below assume a source checkout; `scripts/` helpers are not in
 | `npm run eval:test` | Optional evaluation harness (enable when `evaluation/fixtures/**` is populated). |
 | `npm run docs:check` | Deterministically validates scripts/pipelines/paths referenced in agent-facing docs. |
 | `npm run docs:freshness` | Validates docs registry coverage + review recency; writes `out/<task-id>/docs-freshness.json`. |
-| `npm run ci:cloud-canary` | Runs the cloud canary harness (`scripts/cloud-canary-ci.mjs`) to verify cloud lifecycle manifest + run-summary evidence; credential-gated by `CODEX_CLOUD_ENV_ID` and optional auth secrets (`CODEX_CLOUD_BRANCH` defaults to `main`). Feature flags can be passed through with `CODEX_CLOUD_ENABLE_FEATURES` / `CODEX_CLOUD_DISABLE_FEATURES` (comma- or space-delimited, e.g. `sqlite,memory_tool`). |
+| `npm run ci:cloud-canary` | Runs the cloud canary harness (`scripts/cloud-canary-ci.mjs`) to verify cloud lifecycle manifest + run-summary evidence; credential-gated by `CODEX_CLOUD_ENV_ID` and optional auth secrets (`CODEX_CLOUD_BRANCH` defaults to `main`). Feature flags can be passed through with `CODEX_CLOUD_ENABLE_FEATURES` / `CODEX_CLOUD_DISABLE_FEATURES` (comma- or space-delimited, e.g. `sqlite,memories`). |
 | `node scripts/delegation-guard.mjs` | Enforces subagent delegation evidence before review (repo-only). |
 | `node scripts/spec-guard.mjs --dry-run` | Validates spec freshness; required before review (repo-only). |
 | `node scripts/diff-budget.mjs` | Guards against oversized diffs before review (repo-only; defaults: 25 files / 800 lines; supports explicit overrides). |
-| `npm run pack:smoke` | Downstream simulation gate for npm consumers (tarball install in temp mock repo, `review` wrapper artifacts, delegate-server JSONL, and `skills install --only long-poll-wait`). Core lane runs it automatically when downstream-facing paths change, and `.github/workflows/pack-smoke-backstop.yml` runs a weekly `main` backstop. |
+| `npm run pack:smoke` | Downstream simulation gate for npm consumers (tarball install in temp mock repo, `review` wrapper artifacts, delegate-server JSONL, and `skills install --only long-poll-wait`). Spot-check gate; pair with `npm run pack:audit` when you need full tarball inventory coverage. Core lane runs it automatically when downstream-facing paths change, and `.github/workflows/pack-smoke-backstop.yml` runs a weekly `main` backstop. |
 | `codex-orchestrator review` | Runs the standalone review wrapper with task-scoped manifest evidence; delegation MCP is enabled by default (explicit disable available via `CODEX_REVIEW_DISABLE_DELEGATION_MCP=1` / `--disable-delegation-mcp`), runtime guards are opt-in via `CODEX_REVIEW_*` env vars, and patience-first checkpoints log by default (`CODEX_REVIEW_MONITOR_INTERVAL_SECONDS` tunes/disables). Large uncommitted scopes get an automatic prompt advisory (`CODEX_REVIEW_LARGE_SCOPE_FILE_THRESHOLD` / `CODEX_REVIEW_LARGE_SCOPE_LINE_THRESHOLD`). Optional auto failure issue logging via `CODEX_REVIEW_AUTO_ISSUE_LOG=1` or `--auto-issue-log`. |
 | `npm run review` | Runs `codex review` with task-scoped manifest evidence; delegation MCP is enabled by default (explicit disable available via `CODEX_REVIEW_DISABLE_DELEGATION_MCP=1` / `--disable-delegation-mcp`), runtime guards are opt-in via `CODEX_REVIEW_*` env vars, and patience-first checkpoints log by default (`CODEX_REVIEW_MONITOR_INTERVAL_SECONDS` tunes/disables). Large uncommitted scopes get an automatic prompt advisory (`CODEX_REVIEW_LARGE_SCOPE_FILE_THRESHOLD` / `CODEX_REVIEW_LARGE_SCOPE_LINE_THRESHOLD`). Optional auto failure issue logging via `CODEX_REVIEW_AUTO_ISSUE_LOG=1` or `--auto-issue-log`. |
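
The `ci:cloud-canary` row above says `CODEX_CLOUD_ENABLE_FEATURES` and `CODEX_CLOUD_DISABLE_FEATURES` accept comma- or space-delimited values such as `sqlite,memories`. The sketch below shows one way such a value could be split into feature names. The function `parseFeatureList` is hypothetical and is not the parser used by `scripts/cloud-canary-ci.mjs`, which is not part of this diff.

```javascript
// Hypothetical parser for a comma- or space-delimited feature list.
function parseFeatureList(raw) {
  return String(raw ?? '')
    .split(/[\s,]+/)                      // split on runs of whitespace and/or commas
    .filter((name) => name.length > 0);   // drop empties from leading/trailing separators
}

console.log(parseFeatureList('sqlite,memories'));
// prints [ 'sqlite', 'memories' ]
console.log(parseFeatureList('  js_repl   sqlite, memories '));
// prints [ 'js_repl', 'sqlite', 'memories' ]
```

An unset variable yields an empty array, so callers can treat "no features requested" and "variable missing" the same way.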

package/package.json CHANGED

@@ -1,10 +1,10 @@
 {
   "name": "@kbediako/codex-orchestrator",
-  "version": "0.1.37",
+  "version": "0.1.38",
   "license": "MIT",
   "repository": {
     "type": "git",
-    "url": "https://github.com/Kbediako/CO"
+    "url": "git+https://github.com/Kbediako/CO.git"
   },
   "homepage": "https://github.com/Kbediako/CO#readme",
   "bugs": {
@@ -51,6 +51,7 @@
     "docs:freshness": "node scripts/docs-freshness.mjs --check",
     "docs:sync": "node --loader ts-node/esm scripts/docs-hygiene.ts --sync",
     "ci:cloud-canary": "node scripts/cloud-canary-ci.mjs",
+    "canary:js-repl-usage": "node scripts/js-repl-usage-matrix.mjs",
     "canary:runtime": "node scripts/runtime-mode-canary.mjs",
     "prelint": "node scripts/build-patterns-if-needed.mjs",
     "lint": "eslint orchestrator/src orchestrator/tests packages/orchestrator/src packages/orchestrator/tests packages/shared adapters evaluation/harness evaluation/tests --ext .ts,.tsx",
@@ -126,5 +127,13 @@
     "ink": "^4.4.1",
     "js-yaml": "^4.1.0",
     "react": "^18.3.1"
-  }
+  },
+  "description": "![Setup demo](docs/assets/setup.gif)",
+  "main": "dist/orchestrator/src/cli/orchestrator.js",
+  "directories": {
+    "doc": "docs",
+    "test": "tests"
+  },
+  "keywords": [],
+  "author": ""
 }

package/schemas/manifest.json CHANGED

@@ -880,6 +880,7 @@
           "items": { "type": "string", "minLength": 1 }
         },
         "prompt": { "type": ["string", "null"] },
+        "fork_context": { "type": ["boolean", "null"] },
         "agents_states": {
           "type": ["object", "null"],
           "additionalProperties": true

package/skills/chrome-devtools/SKILL.md CHANGED

@@ -36,5 +36,5 @@ Use this skill when you need browser-grounded evidence (UI screenshots, console
 - `standalone-review`: route ad-hoc review checks through a manifest-backed review loop when findings need auditability.
 - `collab-subagents-first`: isolate heavy browser exploration in a dedicated subagent stream to protect parent context.
-- `frontend-design-review`: use when the task emphasis is structured UI/UX critique with evidence-backed recommendations.
+- `frontend-design-review`: optional global skill (not bundled in CO release); use when the task emphasis is structured UI/UX critique with evidence-backed recommendations.
 - `long-poll-wait`: monitor long-running browser-driven checks or CI replay loops to terminal state.

package/skills/codex-orchestrator/SKILL.md ADDED

@@ -0,0 +1,83 @@
+---
+name: codex-orchestrator
+description: "Primary entrypoint for Codex Orchestrator usage: route tasks to the right pipeline, mode, and supporting skills with minimal, auditable steps."
+---
+# Codex-Orchestrator Workflow Router
+## Overview
+Use this skill as the default entrypoint for work in CO or downstream repos using `@kbediako/codex-orchestrator`. It routes intent to the smallest correct command path and points to specialized skills only when needed.
+## Core Contract
+- Keep MCP as the control plane by default.
+- Use docs-first before implementation edits.
+- Use delegation early for non-trivial work with bounded stream ownership.
+- Keep runtime and execution modes explicit and orthogonal:
+  - `runtimeMode=cli|appserver`
+  - `executionMode=mcp|cloud`
+## Default Command Path
+For most task-scoped work:
+- `codex-orchestrator flow --task <task-id>`
+- `codex-orchestrator doctor --usage --window-days 30 --task <task-id>`
+- `codex-orchestrator review --task <task-id>`
+For explicit stage control:
+- `codex-orchestrator start docs-review --task <task-id> --format json`
+- `codex-orchestrator start implementation-gate --task <task-id> --format json`
+- `codex-orchestrator status --run <run-id> --watch --interval 10`
+## Intent Router
+1) Task/spec scaffolding and mirror sync:
+- Use `docs-first`.
+2) Delegation setup/run-control and subagent evidence discipline:
+- Use `delegation-usage`.
+3) Stream decomposition across independent bounded work:
+- Use `collab-subagents-first`.
+4) Option analysis, tradeoffs, and decision framing before implementation:
+- Use `collab-deliberation`.
+5) Long-running checks/reviews/cloud jobs that need patience-first monitoring:
+- Use `long-poll-wait`.
+6) Implementation checkpoint reviews and final handoff:
+- Use `standalone-review`, then `elegance-review`.
+7) Release/tag/publish workflows:
+- Use `release`.
+8) Collab/multi-agent scenario testing and evidence capture:
+- Use `collab-evals`.
+## Feature Posture
+- `js_repl` is default-on globally (local + cloud lanes); use explicit cloud feature lanes for deterministic contracts.
+- `memories` stays scoped to explicit eval lanes.
+- Subagent context forking (`fork_context`) is guidance-first: keep it `false` for bounded streams, and set `true` only when the child must inherit prior thread history.
+- Compatibility note: upstream still accepts the legacy alias `memory_tool`; use `memories` in new CO guidance unless documenting legacy compatibility behavior.
+## Related Docs
+- `AGENTS.md`
+- `docs/AGENTS.md`
+- `README.md`
+- `docs/README.md`
+## Related Skills
+- `docs-first`
+- `delegation-usage`
+- `collab-subagents-first`
+- `collab-deliberation`
+- `standalone-review`
+- `elegance-review`
+- `long-poll-wait`
+- `release`
+- `agent-first-adoption-steering`

package/skills/collab-subagents-first/SKILL.md CHANGED

@@ -102,6 +102,7 @@ Skip subagents when all conditions are true:
   - `message` (plain text), or
   - `items` (structured input).
 - Do not send both `message` and `items` in one spawn call.
+- Keep `fork_context` disabled by default to preserve bounded context. Enable `fork_context=true` only when the subagent needs prior thread history that would otherwise be costly/risky to restate.
 - `spawn_agent` falls back to `default` when `agent_type` is omitted; always set `agent_type` explicitly.
 - Prefix spawned prompts with `[agent_type:<role>]` on line one so role intent is auditable from collab JSONL/manifests.
 - Use `items` when you need explicit structured context (for example `mention` paths like `app://...` or selected `skill` entries) instead of flattening everything into one long string.

package/skills/delegation-usage/SKILL.md CHANGED

@@ -26,6 +26,7 @@ Multi-agent (collab tools) mode is separate from delegation. For symbolic RLM su
 - Do not send both `message` and `items` in the same `spawn_agent` call.
 - `spawn_agent` falls back to `default` when `agent_type` is omitted; always set `agent_type` explicitly.
 - For auditable role routing, prefix spawned prompts with `[agent_type:<role>]` on the first line and keep it aligned with `agent_type`.
+- Keep `fork_context` disabled by default for bounded streams; use `fork_context=true` only when the child must inherit prior thread context.
 - Spawn returns an `agent_id` (thread id). Current TUI collab rendering is id-based; do not depend on custom visible agent names.
 - Subagents spawned through collab run with approval effectively set to `never`; design child tasks to avoid approval/escalation requirements.
 - Collab spawn depth is bounded. Near/at max depth, recursive delegation can fail or collab can be disabled in children; prefer shallow parent fan-out.

package/templates/codex/.codex/agents/explorer-fast.toml CHANGED

@@ -1,2 +1,3 @@
 model = "gpt-5.3-codex-spark"
 model_reasoning_effort = "xhigh"
+model_reasoning_summary = "none"