mcp-codex-worker 0.1.18 → 0.1.21

package/README.md CHANGED
@@ -1,22 +1,22 @@
  # mcp-codex-worker
 
- a stdio MCP server that bridges MCP clients to the Codex app-server runtime. gives you thread management, turn control, request approval, and model selection through a clean MCP tool surface.
+ A stdio MCP server that bridges MCP clients to the Codex app-server runtime. Provides **5 task tools** for provider-agnostic task orchestration — spawn, wait, respond, message, cancel. Does not call OpenAI APIs directly — all work is delegated to `codex app-server`.
 
- does not call OpenAI APIs directly — all work is delegated to the `codex app-server`.
+ ## Install
 
- ## install
+ ### MCP server
 
  ```bash
  npx -y mcp-codex-worker
  ```
 
- or add to Claude Code globally:
+ Add to Claude Code globally:
 
  ```bash
  claude mcp add codex-worker --scope user -- npx -y mcp-codex-worker
  ```
 
- or add to any MCP client config:
+ Add to any MCP client config (Claude Desktop, VS Code, Cursor, etc.):
 
  ```json
  {
@@ -29,107 +29,174 @@ or add to any MCP client config:
  }
  ```
 
- ## requirements
+ ### Companion skill (optional)
 
- - node 22+
+ The `run-codex-subagents` skill teaches AI agents how to orchestrate tasks through this server — wave execution, approval handling, parallel dispatch, and more.
+
+ ```bash
+ npx -y skills add -y -g yigitkonur/skills-by-yigitkonur/skills/run-codex-subagents
+ ```
+
+ Or install the full skills pack:
+
+ ```bash
+ npx -y skills add -y -g yigitkonur/skills-by-yigitkonur
+ ```
+
+ The skill is also bundled at `skills/run-codex-subagents/` in this repo for reference.
+
+ ## Requirements
+
+ - Node 22+
  - `codex` CLI installed and authenticated
 
- ## tools
+ ## Unified task tools
 
- ### thread management
+ The primary interface. Provider-agnostic — tasks route to Codex today, Copilot and Claude CLI in Phase 2.
 
- | tool | description |
+ | Tool | Purpose |
  |---|---|
- | `thread-start` | create a new conversation thread; each thread is an independent agent workspace |
- | `thread-resume` | resume an existing thread, optionally switching model or cwd |
- | `thread-read` | read thread state and conversation history |
- | `thread-list` | list recent threads for discovery |
+ | `spawn-task` | Create and start a coding task. Returns immediately with a task_id. |
+ | `wait-task` | Block until a task completes, fails, or needs input. |
+ | `respond-task` | Answer an agent's question or approve a pending action. |
+ | `message-task` | Send a follow-up message to an active task. |
+ | `cancel-task` | Cancel one or more tasks (single or batch). |
 
- ### turn control
+ ### Typical workflow
 
- | tool | description |
- |---|---|
- | `turn-start` | send a message to a thread, starting an autonomous agent turn |
- | `turn-steer` | redirect an in-progress turn with new instructions |
- | `turn-interrupt` | stop an active turn immediately |
+ ```
+ spawn-task(prompt, cwd) → task_id, status
+ wait-task(task_id) → completed | input_required | failed
+ respond-task(task_id, type, ...) → task resumes (if paused)
+ wait-task(task_id) → completed
+ ```
 
- ### request approval
+ ### spawn-task
 
- | tool | description |
- |---|---|
- | `request-list` | list pending server requests (command approvals, permissions, etc.) |
- | `request-read` | read details of a specific pending request |
- | `request-respond` | approve/decline/answer a pending request |
+ Create and start a task. The agent begins working immediately.
 
- ### introspection
+ | Parameter | Type | Required | Description |
+ |---|---|---|---|
+ | `prompt` | string | yes | What the task should do. Be specific — include file paths, function names. |
+ | `cwd` | string | no | Working directory. Agent sees files here. |
+ | `task_type` | enum | no | `coder` (default), `planner`, `tester`, `researcher`, `general` |
+ | `model` | string | no | Override provider default model. |
+ | `timeout_ms` | integer | no | Max execution time (1,000–3,600,000 ms). |
+ | `developer_instructions` | string | no | System-level constraints injected before the prompt. |
+ | `labels` | string[] | no | Arbitrary labels for filtering. |
+ | `depends_on` | string[] | no | Task IDs that must complete first. |
+ | `context_files` | array | no | Files to include: `[{ path, description? }]` |
 
- | tool | description |
- |---|---|
- | `model-list` | list available models |
- | `account-read` | read account details |
- | `account-rate-limits-read` | check rate limit status |
- | `skills-list` | list registered skills |
- | `app-list` | list available apps |
- | `wait` | block until an operation completes or a request appears |
+ Returns: `{ task_id, status, poll_frequency, provider_session_id, resources }`
 
- ## parallel execution
+ ### wait-task
 
- launch multiple threads simultaneously for parallel work:
+ Block until a task reaches a terminal state or `input_required`.
 
- ```
- thread-start → thread_id_1
- thread-start → thread_id_2
- thread-start → thread_id_3
+ | Parameter | Type | Required | Default |
+ |---|---|---|---|
+ | `task_id` | string | yes | — |
+ | `timeout_ms` | integer | no | 30,000 |
+ | `poll_interval_ms` | integer | no | 1,000 |
 
- turn-start(thread_id_1, "implement auth module...")
- turn-start(thread_id_2, "implement payment module...")
- turn-start(thread_id_3, "write e2e tests...")
- ```
+ Returns: `{ task_id, status, provider_session_id, pending_question?, output? }`
 
- each thread is fully isolated — they can work on different tasks concurrently without interfering.
+ ### respond-task
 
- ## resources
+ Respond to a paused task. The `type` field must match the `pending_question.type` from wait-task.
 
- | uri | description |
- |---|---|
- | `codex://threads` | latest threads from thread/list |
- | `codex://thread/{id}` | full thread with turns |
- | `codex://thread/{id}/events` | observed notifications for a thread |
- | `codex://models` | available models |
- | `codex://account` | account details and rate limits |
- | `codex://requests` | pending server requests |
+ | Type | When | Key fields |
+ |---|---|---|
+ | `user_input` | Agent has questions | `answers: { "key": "value" }` |
+ | `command_approval` | Agent wants to run a command | `decision: "accept" \| "reject"` |
+ | `file_approval` | Agent wants to modify files | `decision: "accept" \| "reject"` |
+ | `elicitation` | MCP server needs confirmation | `action: "accept" \| "decline"` |
+ | `dynamic_tool` | Agent invoked an external tool | `result: "..."` or `error: "..."` |
+
+ ### message-task
 
- ## environment variables
+ Send a follow-up to an active task. Only works on non-terminal tasks.
 
- | variable | description | default |
+ | Parameter | Type | Required |
  |---|---|---|
- | `CODEX_APP_SERVER_COMMAND` | codex binary path | `codex` |
- | `CODEX_APP_SERVER_ARGS` | app-server arguments | `app-server --listen stdio://` |
- | `CODEX_HOME_DIRS` | colon-separated profile roots for failover | `~/.codex` |
- | `CODEX_ENABLE_FLEET` | enable fleet mode (appends sub-agent instructions) | off |
+ | `task_id` | string | yes |
+ | `message` | string | yes |
+ | `model` | string | no |
 
- ## typical workflow
+ ### cancel-task
+
+ Cancel one or many tasks.
+
+ | Parameter | Type | Required |
+ |---|---|---|
+ | `task_id` | string or string[] | yes |
+
+ Returns: `{ cancelled: [...], already_terminal: [...], not_found: [...] }`
+
+ ## Task resources
+
+ | URI | Description |
+ |---|---|
+ | `task:///all` | Scoreboard — all tasks with status badges and elapsed time |
+ | `task:///{id}` | Detail — metadata, provider session, timestamps, error |
+ | `task:///{id}/log` | Summary log — last 20 output lines |
+ | `task:///{id}/log.verbose` | Verbose log — full output history |
+
+ ### Wire states (SEP-1686)
+
+ All statuses returned by tools use these 7 values:
+
+ | State | Meaning |
+ |---|---|
+ | `submitted` | Queued, not started |
+ | `working` | Agent is executing |
+ | `input_required` | Paused, needs response |
+ | `completed` | Done |
+ | `failed` | Error |
+ | `cancelled` | Interrupted |
+ | `unknown` | Crash recovery fallback |
+
+ ## Parallel execution
+
+ Spawn multiple tasks simultaneously. Each runs in an independent agent workspace.
 
  ```
- 1. thread-start → get thread_id
- 2. turn-start(thread_id, prompt) → agent starts working
- 3. wait(thread_id=...) → wait for completion or request
- 4. request-list → check if agent needs approval
- 5. request-respond(request_id) → approve and resume
- 6. thread-read(thread_id) → read final results
+ spawn-task(prompt: "implement auth module", cwd: "/project") → task_a
+ spawn-task(prompt: "implement billing module", cwd: "/project") → task_b
+ spawn-task(prompt: "write e2e tests", cwd: "/project") → task_c
+
+ # Monitor via scoreboard
+ read resource: task:///all
+ → tasks -- 3 total (1 done, 2 busy)
+
+ # Wait for each
+ wait-task(task_a) → completed
+ wait-task(task_b) → completed
+ wait-task(task_c) → completed
  ```
 
- ## local development
+ ## Environment variables
+
+ | Variable | Description | Default |
+ |---|---|---|
+ | `CODEX_APP_SERVER_COMMAND` | Codex binary path | `codex` |
+ | `CODEX_APP_SERVER_ARGS` | App-server arguments | `app-server --listen stdio://` |
+ | `CODEX_HOME_DIRS` | Colon-separated profile roots for failover | `~/.codex` |
+ | `CODEX_ENABLE_FLEET` | Enable fleet mode (sub-agent instructions) | off |
+
+ ## Local development
 
  ```bash
  npm install
  npm run build
- npm run test:unit
- npm run smoke # requires codex CLI
+ npm run test:unit # 158 tests
+ npm run smoke # requires codex CLI
  ```
 
- ## troubleshooting
+ ### Contract tests (mcpc)
+
+ ```bash
+ ./test/mcpc/gherkin-tests.sh # 45 scenarios, 84 assertions
+ ```
 
- - make sure `codex` CLI is installed and authenticated
- - check `CODEX_APP_SERVER_COMMAND` if using a non-standard install path
- - use `account-rate-limits-read` before launching many parallel threads
+ Requires [mcpc](https://github.com/nicobailey/mcpc) v0.1.11+.
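
The spawn, wait, respond, wait workflow described in the new README can be read as a small decision step applied to each `wait-task` result. The following is a minimal sketch of that step, not code from the package: the `WaitResult` shape is a subset of the documented return value, while `NextStep`, `nextStep`, the choice to auto-accept approvals, and the handling of `unknown` are all assumptions made for illustration.

```typescript
// Wire states copied from the README's "Wire states" table.
type WireState =
  | "submitted" | "working" | "input_required"
  | "completed" | "failed" | "cancelled" | "unknown";

// Subset of the wait-task return shape listed in the README.
interface WaitResult {
  task_id: string;
  status: WireState;
  pending_question?: { type: string };
  output?: string;
}

// Hypothetical decision type; not part of mcp-codex-worker.
type NextStep =
  | { kind: "done" }
  | { kind: "wait_again" }
  | { kind: "respond"; args: { task_id: string; type: string; decision: "accept" } }
  | { kind: "ask_human" };

// Decide what a client could do after one wait-task call.
function nextStep(result: WaitResult): NextStep {
  switch (result.status) {
    case "completed":
    case "failed":
    case "cancelled":
      // Assumption: these three states are the terminal ones.
      return { kind: "done" };
    case "input_required": {
      const type = result.pending_question?.type;
      // Assumption: this sketch auto-accepts command and file approvals.
      // The README requires `type` to match pending_question.type.
      if (type === "command_approval" || type === "file_approval") {
        return { kind: "respond", args: { task_id: result.task_id, type, decision: "accept" } };
      }
      // user_input, elicitation, dynamic_tool need real answers.
      return { kind: "ask_human" };
    }
    case "submitted":
    case "working":
      return { kind: "wait_again" };
    default:
      // "unknown" (crash recovery fallback): escalate rather than loop.
      return { kind: "ask_human" };
  }
}

console.log(nextStep({ task_id: "t1", status: "working" }).kind); // prints "wait_again"
```

A client loop would call `wait-task`, pass the parsed result to `nextStep`, and call `respond-task` with `step.args` whenever `kind` is `"respond"`. A real client would likely inspect the pending request before accepting it.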
package/dist/src/app.d.ts CHANGED
@@ -20,19 +20,9 @@ export declare class CodexWorkerApp {
  text: string;
  }>;
  callTool(name: string, args: unknown): Promise<string>;
- private handleThreadStart;
- private handleThreadResume;
- private handleThreadRead;
- private handleThreadList;
- private handleTurnStart;
- private handleTurnSteer;
- private handleTurnInterrupt;
- private handleRequestRespond;
- private handleWait;
  private handleSpawnTask;
  private handleWaitTask;
  private handleRespondTask;
  private handleMessageTask;
  private handleCancelTask;
- private buildServerRequestPayload;
  }
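
For a client written against 0.1.18, the handler removals above suggest checking which tool names it still relies on. The sketch below classifies a tool name using only what this diff shows. The pairing of tool names with handler names (for example `turn-steer` with `handleTurnSteer`) is inferred from naming and is an assumption, and `classifyTool` is a hypothetical helper, not part of the package.

```typescript
// Handlers that remain in app.d.ts after this diff, keyed by the
// tool names from the 0.1.21 README. The pairing is inferred from names.
const TASK_TOOL_HANDLERS = {
  "spawn-task": "handleSpawnTask",
  "wait-task": "handleWaitTask",
  "respond-task": "handleRespondTask",
  "message-task": "handleMessageTask",
  "cancel-task": "handleCancelTask",
} as const;

// 0.1.18 tools whose private handlers are deleted in this diff
// (again paired by name, which is an assumption).
const REMOVED_HANDLER_TOOLS: readonly string[] = [
  "thread-start", "thread-resume", "thread-read", "thread-list",
  "turn-start", "turn-steer", "turn-interrupt",
  "request-respond", "wait",
];

type ToolStatus = "task_tool" | "handler_removed" | "not_in_diff";

// Hypothetical helper: report what this diff says about a tool name.
function classifyTool(name: string): ToolStatus {
  if (Object.prototype.hasOwnProperty.call(TASK_TOOL_HANDLERS, name)) {
    return "task_tool";
  }
  if (REMOVED_HANDLER_TOOLS.indexOf(name) !== -1) {
    return "handler_removed";
  }
  // Tools such as model-list appear in the old README but have no
  // handler line in this declaration diff, so nothing can be concluded.
  return "not_in_diff";
}

console.log(classifyTool("turn-steer")); // prints "handler_removed"
```

The `not_in_diff` result is deliberately neutral: the declaration file only shows private handler names, so it cannot confirm whether tools without a visible handler are still exposed.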