npm - mcp-codex-worker - Versions diffs - 1.0.4 → 1.0.5 - Mend

mcp-codex-worker 1.0.4 → 1.0.5

Files changed (2)

package/README.md +109 -118
package/package.json +1 -1

package/README.md CHANGED

@@ -1,29 +1,26 @@
 # mcp-codex-worker
-A stdio MCP server that bridges MCP clients to the Codex app-server runtime. Provides **5 task tools** for provider-agnostic task orchestration — spawn, wait, respond, message, cancel. Does not call OpenAI APIs directly — all work is delegated to `codex app-server`.
+A stdio MCP server that bridges MCP clients to the Codex app-server runtime. Provides **5 task tools** for fully autonomous agent orchestration — spawn, wait, respond, message, cancel. All commands and file edits are auto-approved. Full observability via file-backed reporting and MCP resource subscriptions.
-## Install
-### MCP server
-```bash
-npx -y mcp-codex-worker
-```
+Does not call OpenAI APIs directly — all work is delegated to `codex app-server` (see [codex-app-server protocol reference](https://github.com/openai/codex/blob/main/codex-rs/app-server/README.md)).
-Add to Claude Code globally:
+## Install
 ```bash
-claude mcp add codex-worker --scope user -- npx -y mcp-codex-worker
+npx -y mcp-codex-worker@latest
 ```
-Add to any MCP client config (Claude Desktop, VS Code, Cursor, etc.):
+Add to Claude Code:
 ```json
 {
   "mcpServers": {
     "codex-worker": {
       "command": "npx",
-      "args": ["-y", "mcp-codex-worker"]
+      "args": ["-y", "mcp-codex-worker@latest"],
+      "env": {
+        "CODEX_LB_API_KEY": "${CODEX_LB_API_KEY}"
+      }
     }
   }
 }
@@ -31,172 +28,166 @@ Add to any MCP client config (Claude Desktop, VS Code, Cursor, etc.):
 ### Companion skill (optional)
-The `run-codex-subagents` skill teaches AI agents how to orchestrate tasks through this server — wave execution, approval handling, parallel dispatch, and more.
 ```bash
 npx -y skills add -y -g yigitkonur/skills-by-yigitkonur/skills/run-codex-subagents
 ```
-Or install the full skills pack:
-```bash
-npx -y skills add -y -g yigitkonur/skills-by-yigitkonur
-```
-The skill is also bundled at `skills/run-codex-subagents/` in this repo for reference.
 ## Requirements
 - Node 22+
-- `codex` CLI installed and authenticated
+- `codex` CLI installed
+- `CODEX_LB_API_KEY` or `OPENAI_API_KEY` in environment
-## Unified task tools
-The primary interface. Provider-agnostic — tasks route to Codex today, Copilot and Claude CLI in Phase 2.
+## Tools
 | Tool | Purpose |
 |---|---|
-| `spawn-task` | Create and start a coding task. Returns immediately with a task_id. |
-| `wait-task` | Block until a task completes, fails, or needs input. |
-| `respond-task` | Answer an agent's question or approve a pending action. |
-| `message-task` | Send a follow-up message to an active task. |
+| `spawn-task` | Create and start a task. Returns `task_id`, `disk_paths`, and "what to do next" guidance. |
+| `wait-task` | Block until completion/failure/input. Returns output, liveness signals, and next actions. |
+| `respond-task` | Answer agent questions (`user_input` only — approvals are auto-handled). |
+| `message-task` | Send follow-up instructions to a running task. |
 | `cancel-task` | Cancel one or more tasks (single or batch). |
+### Model selection
+Only `gpt-5.4` — pass the `reasoning` parameter:
+| Value | Use case |
+|---|---|
+| `gpt-5.4(medium)` | Default. Most coding, refactors, debugging. |
+| `gpt-5.4(high)` | Multi-file reasoning, subtle bugs, design decisions. |
+| `gpt-5.4(xhigh)` | Deep research, novel architecture. |
+| `gpt-5.4(low)` | Trivial mechanical edits. |
 ### Typical workflow
 ```
-spawn-task(prompt, cwd)           → task_id, status
-wait-task(task_id)                → completed | input_required | failed
-respond-task(task_id, type, ...)  → task resumes (if paused)
-wait-task(task_id)                → completed
+spawn-task { prompt: "..." }
+  → { task_id: "bold-eagle-456", status: "working", disk_paths: {...} }
+    ---
+    **What to do next:**
+    - Call `wait-task` with task_id to block until done.
+    - If you have more tasks, launch them now — all run in parallel.
+wait-task { task_id: "bold-eagle-456" }
+  → { status: "completed", output: [...] }
+    ---
+    **What to do next:**
+    - Read `task:///bold-eagle-456` for the full result.
 ```
-### spawn-task
+No `respond-task` needed for command/file approvals — they're auto-approved instantly.
-Create and start a task. The agent begins working immediately.
+## Auto-Approval
-| Parameter | Type | Required | Description |
-|---|---|---|---|
-| `prompt` | string | yes | What the task should do. Be specific — include file paths, function names. |
-| `cwd` | string | no | Working directory. Agent sees files here. |
-| `task_type` | enum | no | `coder` (default), `planner`, `tester`, `researcher`, `general` |
-| `model` | string | no | Override provider default model. |
-| `timeout_ms` | integer | no | Max execution time (1,000–3,600,000 ms). |
-| `developer_instructions` | string | no | System-level constraints injected before the prompt. |
-| `labels` | string[] | no | Arbitrary labels for filtering. |
-| `depends_on` | string[] | no | Task IDs that must complete first. |
-| `context_files` | array | no | Files to include: `[{ path, description? }]` |
+The server auto-approves ALL Codex approval requests without waiting:
-Returns: `{ task_id, status, poll_frequency, provider_session_id, resources }`
+| Request | Response | Log |
+|---|---|---|
+| Command approval | `accept` | `[auto-approve] cmd: npm test` |
+| File change | `accept` | `[auto-approve] file change` |
+| MCP elicitation | `accept` | `[auto-approve] elicitation` |
+| Permission request | `grant all for session` | `[auto-approve] permissions` |
-### wait-task
+Only `user_input` (agent asks a question) and `dynamic_tool` (external tool needs data) still pause the task.
-Block until a task reaches a terminal state or `input_required`.
+Combined with `approval_policy = "never"` and `sandbox_mode = "danger-full-access"` from your `config.toml`, agents run fully autonomously.
-| Parameter | Type | Required | Default |
-|---|---|---|---|
-| `task_id` | string | yes | — |
-| `timeout_ms` | integer | no | 30,000 |
-| `poll_interval_ms` | integer | no | 1,000 |
+## Reporting & Observability
-Returns: `{ task_id, status, provider_session_id, pending_question?, output? }`
+Every task persists to `~/.mcp-codex-worker/tasks/<task-id>/`:
-### respond-task
+| File | Content |
+|---|---|
+| `meta.json` | Task state snapshot (refreshes every 10s during activity) |
+| `summary.log` | Formatted one-liners: `[HH:MM:SS] cmd: npm test (exit 0, 1.2s)` |
+| `verbose.log` | Full execution trace, streaming command output |
+| `events.jsonl` | Raw Codex notification stream — every event with timestamp |
-Respond to a paused task. The `type` field must match the `pending_question.type` from wait-task.
+### MCP Resources
-| Type | When | Key fields |
-|---|---|---|
-| `user_input` | Agent has questions | `answers: { "key": "value" }` |
-| `command_approval` | Agent wants to run a command | `decision: "accept" \| "reject"` |
-| `file_approval` | Agent wants to modify files | `decision: "accept" \| "reject"` |
-| `elicitation` | MCP server needs confirmation | `action: "accept" \| "decline"` |
-| `dynamic_tool` | Agent invoked an external tool | `result: "..."` or `error: "..."` |
+| URI | Content |
+|---|---|
+| `task:///all` | Scoreboard (auto-subscribed, push on change) |
+| `task:///{id}` | Task detail with metadata |
+| `task:///{id}/log` | Summary log |
+| `task:///{id}/log.verbose` | Verbose log (reads from disk) |
+| `task:///{id}/events` | Raw event trace (JSONL from disk) |
-### message-task
+### Resource Subscriptions
-Send a follow-up to an active task. Only works on non-terminal tasks.
+```json
+{ "resources": { "subscribe": true, "listChanged": true } }
+```
-| Parameter | Type | Required |
-|---|---|---|
-| `task_id` | string | yes |
-| `message` | string | yes |
-| `model` | string | no |
+- `task:///all` auto-subscribed at startup
+- `notifications/resources/updated` on status changes (immediate) and output (throttled 1s)
+- `notifications/resources/list_changed` on task creation
-### cancel-task
+## Codex App-Server Integration
-Cancel one or many tasks.
+This server acts as a client to the [Codex app-server](https://github.com/openai/codex/blob/main/codex-rs/app-server/README.md) protocol. Key integration points:
-| Parameter | Type | Required |
-|---|---|---|
-| `task_id` | string or string[] | yes |
+| Feature | How we use it |
+|---|---|
+| `initialize` | `experimentalApi: true`, then `account/login/start` with API key |
+| `thread/start` | Omit `approvalPolicy`/`sandbox` — config.toml takes effect |
+| `turn/start` | Pass `effort` for reasoning level |
+| Approval requests | Auto-respond via `respondToServerRequest()` |
+| `turn/completed` | Captures turn status in summary log |
+| `item/*` notifications | Event capture module processes 6 types into logs |
+| `error` notification | Captures `codexErrorInfo` classification |
+| stderr | Piped to verbose.log and events.jsonl for diagnostics |
+| Process exit | Detects exit code/signal, classifies auth errors |
-Returns: `{ cancelled: [...], already_terminal: [...], not_found: [...] }`
+### Auth
-## Task resources
+On startup, sends `account/login/start` with `CODEX_LB_API_KEY` (or `OPENAI_API_KEY`) to switch from stale OAuth tokens to API key auth. This prevents the "refresh token already used" crash.
-| URI | Description |
-|---|---|
-| `task:///all` | Scoreboard — all tasks with status badges and elapsed time |
-| `task:///{id}` | Detail — metadata, provider session, timestamps, error |
-| `task:///{id}/log` | Summary log — last 20 output lines |
-| `task:///{id}/log.verbose` | Verbose log — full output history |
+### Crash Recovery
-### Wire states (SEP-1686)
+On restart, loads all `meta.json` files from disk. Terminal tasks restore as-is. Non-terminal tasks (RUNNING, WAITING_ANSWER) are marked UNKNOWN since Codex sessions don't survive restarts.
-All statuses returned by tools use these 7 values:
+## Wire States (SEP-1686)
 | State | Meaning |
 |---|---|
 | `submitted` | Queued, not started |
-| `working` | Agent is executing |
-| `input_required` | Paused, needs response |
+| `working` | Agent executing |
+| `input_required` | Paused — needs `user_input` or `dynamic_tool` response |
 | `completed` | Done |
-| `failed` | Error |
-| `cancelled` | Interrupted |
+| `failed` | Error (includes AUTH_TOKEN_EXPIRED, RATE_LIMITED classifications) |
+| `cancelled` | Interrupted by user |
 | `unknown` | Crash recovery fallback |
-## Parallel execution
-Spawn multiple tasks simultaneously. Each runs in an independent agent workspace.
-```
-spawn-task(prompt: "implement auth module", cwd: "/project")   → task_a
-spawn-task(prompt: "implement billing module", cwd: "/project") → task_b
-spawn-task(prompt: "write e2e tests", cwd: "/project")          → task_c
-# Monitor via scoreboard
-read resource: task:///all
-→ tasks -- 3 total (1 done, 2 busy)
-# Wait for each
-wait-task(task_a) → completed
-wait-task(task_b) → completed
-wait-task(task_c) → completed
-```
-## Environment variables
+## Environment Variables
 | Variable | Description | Default |
 |---|---|---|
+| `CODEX_LB_API_KEY` | API key for codex-lb provider | — |
+| `OPENAI_API_KEY` | Fallback API key | — |
 | `CODEX_APP_SERVER_COMMAND` | Codex binary path | `codex` |
 | `CODEX_APP_SERVER_ARGS` | App-server arguments | `app-server --listen stdio://` |
 | `CODEX_HOME_DIRS` | Colon-separated profile roots for failover | `~/.codex` |
 | `CODEX_ENABLE_FLEET` | Enable fleet mode (sub-agent instructions) | off |
-## Local development
+## Development
 ```bash
 npm install
-npm run build
-npm run test:unit    # 158 tests
-npm run smoke        # requires codex CLI
+npm run build        # TypeScript compile
+npm test             # 158 unit tests
+npm run smoke        # live Codex test (needs CODEX_LB_API_KEY)
+npm run serve        # start locally
 ```
-### Contract tests (mcpc)
+### Testing with mcpc
 ```bash
-./test/mcpc/gherkin-tests.sh   # 45 scenarios, 84 assertions
+mcpc --config config.json cw connect @test
+mcpc @test tools-list
+mcpc @test resources-read "task:///all"
+mcpc @test tools-call spawn-task '{"prompt":"echo hello","reasoning":"gpt-5.4(low)"}'
 ```
-Requires [mcpc](https://github.com/nicobailey/mcpc) v0.1.11+.
+Use the local build (`node --import tsx src/index.ts`) for mcpc — `npx` has bridge startup timing issues.
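
Note on the added "Crash Recovery" section: the rule it describes (terminal tasks restore as-is, non-terminal tasks become UNKNOWN) can be sketched in a few lines of TypeScript. This is only an illustration under stated assumptions, not the package's code: the `TaskMeta` shape, the `recoverTask` name, the terminal state names `COMPLETED`, `FAILED`, and `CANCELLED`, and the task ID `calm-otter-789` are all hypothetical. Only `RUNNING`, `WAITING_ANSWER`, and `UNKNOWN` appear in the README text.

```typescript
// Hypothetical internal states. Only RUNNING, WAITING_ANSWER, and UNKNOWN
// are named in the README; the terminal names are assumed for illustration.
type TaskState =
  | "RUNNING"
  | "WAITING_ANSWER"
  | "COMPLETED"
  | "FAILED"
  | "CANCELLED"
  | "UNKNOWN";

// Hypothetical subset of what a meta.json snapshot might hold.
interface TaskMeta {
  taskId: string;
  state: TaskState;
}

const TERMINAL_STATES: ReadonlySet<TaskState> = new Set<TaskState>([
  "COMPLETED",
  "FAILED",
  "CANCELLED",
]);

// Terminal tasks are returned unchanged; anything else is marked UNKNOWN,
// because the Codex session behind it did not survive the restart.
function recoverTask(meta: TaskMeta): TaskMeta {
  return TERMINAL_STATES.has(meta.state) ? meta : { ...meta, state: "UNKNOWN" };
}

const recovered = [
  { taskId: "bold-eagle-456", state: "COMPLETED" as const },
  { taskId: "calm-otter-789", state: "RUNNING" as const },
].map(recoverTask);

console.log(recovered.map((t) => `${t.taskId}: ${t.state}`).join("\n"));
```

Keeping the terminal states in one set means a new terminal state only has to be added in a single place.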

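Note on the added "Model selection" table: values such as `gpt-5.4(high)` combine a model name with a reasoning level, and the integration table says `turn/start` receives an `effort`. One way a caller could split such a value is sketched below. The `parseReasoning` helper, the `ModelSelection` type, and the fallback to `gpt-5.4(medium)` when no value is given are assumptions based on the table's "Default" label, not the package's implementation.

```typescript
type Effort = "low" | "medium" | "high" | "xhigh";

interface ModelSelection {
  model: string;
  effort: Effort;
}

const EFFORTS: readonly Effort[] = ["low", "medium", "high", "xhigh"];

// Hypothetical helper: splits "gpt-5.4(high)" into a model name and an effort.
// The default argument mirrors the table's "Default" row (an assumption).
function parseReasoning(value: string = "gpt-5.4(medium)"): ModelSelection {
  const match = /^([^()]+)\(([a-z]+)\)$/.exec(value.trim());
  if (!match) {
    throw new Error(`Expected "model(effort)", got "${value}"`);
  }
  const [, model, effort] = match;
  if (!EFFORTS.includes(effort as Effort)) {
    throw new Error(`Unknown effort "${effort}"`);
  }
  return { model, effort: effort as Effort };
}

console.log(parseReasoning("gpt-5.4(high)"));
```

Rejecting unknown effort levels early keeps a typo in the `reasoning` value from reaching the app-server.
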
package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "mcp-codex-worker",
-  "version": "1.0.4",
+  "version": "1.0.5",
   "description": "MCP server bridge for Codex app-server",
   "type": "module",
   "main": "dist/src/index.js",