llm-cli-gateway 1.11.0 → 1.13.0

package/CHANGELOG.md CHANGED
@@ -2,6 +2,251 @@
2
2
 
3
3
  All notable changes to the llm-cli-gateway project.
4
4
 
5
+ ## [1.13.0] - 2026-05-27 — Phase 4 slice θ (Grok HIGH parity)
6
+
7
+ Ships the eighth Phase 4 slice: five HIGH-impact Grok CLI flags are now
8
+ reachable from `grok_request` and `grok_request_async`. Grok was the
9
+ most under-wired provider per the 2026-05-27 audit; this slice closes
10
+ the HIGH-severity gap in a single bundled PR. Three commits land
11
+ together (feature wiring, contract registration, test-veracity
12
+ regressions) plus this release commit.
13
+
14
+ ### Added — five HIGH-impact Grok flags
15
+
16
+ - **`sandbox`** → `--sandbox <PROFILE>`. Freeform passthrough per
17
+ `grok --help` on 0.1.210 (no `[possible values: …]` listing, unlike
18
+ `--effort` / `--permission-mode` / `--output-format` which all
19
+ enumerate). Also settable via the `GROK_SANDBOX` env var. Caller
20
+ responsibility to pass a valid profile name. The slice deliberately
21
+ does **not** integrate `--sandbox` with `approvalStrategy:
22
+ "mcp_managed"` because the value is unbounded — Grok's approval
23
+ semantics are already covered by `permissionMode` + `alwaysApprove` +
24
+ `approvalStrategy`.
25
+ - **`rules`** → `--rules <RULES>`. Supports `@file` prefix per
26
+ `grok --help` to load from a file; the gateway passes the value
27
+ verbatim and lets Grok parse the prefix. Bounded via
28
+ `z.string().min(1)`.
29
+ - **`systemPromptOverride`** → `--system-prompt-override <PROMPT>`.
30
+ Distinct from Claude's `--system-prompt` / `--append-system-prompt`
31
+ (Grok has only one override flag, not a pair). Bounded via
32
+ `z.string().min(1)`.
33
+ - **`allow`** → `--allow <RULE>` (repeatable). Each array entry is
34
+ emitted as its own `--allow` argv instance per `grok --help`
35
+ ("Repeat to add multiple rules"). NOT comma-joined like the existing
36
+ `--tools` / `--disallowed-tools` Grok wiring.
37
+ - **`deny`** → `--deny <RULE>` (repeatable). Same semantics as `allow`.
38
+
39
+ All five flags surfaced on both `grok_request` and `grok_request_async`
40
+ (slice δ sync+async parity invariant). Threaded from MCP-side Zod
41
+ through `GrokRequestParams` → `handleGrokRequest` /
42
+ `handleGrokRequestAsync` → `prepareGrokRequest` argv emission.
43
+
44
+ ### Contract surface
45
+
46
+ `UPSTREAM_CLI_CONTRACTS.grok` updates:
47
+
48
+ - `flags["--sandbox"]` (arity:"one"; **NO `values` enum** per live
49
+ `grok --help` — `--sandbox` is freeform, unlike Codex's
50
+ read-only/workspace-write/danger-full-access enum).
51
+ - `flags["--rules"]` (arity:"one").
52
+ - `flags["--system-prompt-override"]` (arity:"one").
53
+ - `flags["--allow"]` (arity:"one"; multiple instances accepted because
54
+ `arity:"one"` means "consumes one value per instance" not "max one
55
+ instance").
56
+ - `flags["--deny"]` (arity:"one"; same).
57
+ - `mcpParameters` array updated with five new entries.
58
+ - Five new passing conformance fixtures (`grok-sandbox`, `grok-rules`,
59
+ `grok-system-prompt-override`, `grok-allow-repeated`,
60
+ `grok-deny-repeated`); each is mechanically validated against
61
+ `validateUpstreamCliArgs` in the REGRESSIONS Tε suite, closing the
62
+ fixture-existence-vs-mechanical-validation gap identified in slice ε
63
+ round 1.
64
+
65
+ ### Out of scope
66
+
67
+ - **Approval-manager integration for `--sandbox`** — explicitly
68
+ deferred. Grok's sandbox value is freeform per the live CLI surface;
69
+ integrating it with the approval manager (as Codex does for its
70
+ bounded enum) would require either (a) hardcoding an allowlist of
71
+ profile names in the gateway, or (b) a different security model
72
+ where the caller asserts the profile is "safe enough". Neither is
73
+ obvious from current Grok docs. Revisit when Grok ships an enum or
74
+ publishes a sandbox-profile taxonomy.
75
+
76
+ ### Test-veracity audit
77
+
78
+ Per the standing protocol
79
+ (`feedback_test_veracity_audit_protocol`), this slice's tests were
80
+ audited by four LLM reviewers (Codex, Grok, Mistral, Claude) in async
81
+ parallel with mandatory mutation-probe execution against
82
+ `docs/plans/test-veracity-audit-slice-theta.spec.md`.
83
+
84
+ **Round 1 outcomes:**
85
+
86
+ - Codex: UNCONDITIONAL APPROVE — all 12 probes [as predicted], all
87
+ 26 tests VERIFIED. Baseline (`npm test`: 55 files / 884 tests; build
88
+ + format:check clean; slice file 31/31).
89
+ - Grok: UNCONDITIONAL APPROVE — all 12 probes [as predicted]; ran in
90
+ an isolated worktree at `/tmp/theta-audit-grok` per the slice-ζ
91
+ reviewer-stomping lesson.
92
+ - Mistral: UNCONDITIONAL APPROVE — all 12 probes [as predicted].
93
+ - Claude: UNCONDITIONAL APPROVE — all 12 probes [as predicted]; noted
94
+ the extra Tε-2 test (custom-profile freeform regression probe) goes
95
+ beyond the spec and closes the "enum-mistake stays silent if fixture
96
+ uses a listed value" gap.
97
+ - Gemini: **FAILED at 10s** with `TerminalQuotaError: You have
98
+ exhausted your capacity on this model. Your quota will reset after
99
+ 52m10s.` (Google 429). Documented quota blocker per protocol clause
100
+ 5+6 — counts as "concrete unfixable when documented". Four
101
+ substantive valid approves from independent vendor families (OpenAI,
102
+ xAI, Mistral, Anthropic) satisfy the gate.
103
+
104
+ The 31 new tests (853 → 884 total) cover every new field/flag/fixture
105
+ across REGRESSIONS Tα/β/ε:
106
+
107
+ - **Tα** — Registered tool inputSchema for every new field on both
108
+ sync and async tools, including `.min(1)` empty-string rejection on
109
+ the three string fields (sandbox, rules, systemPromptOverride).
110
+ - **Tβ** — `prepareGrokRequest` end-to-end argv emission per flag.
111
+ Explicit "repeated `--allow`/`--deny` instances, NOT comma-joined
112
+ like `--tools`" assertions catch the comma-join regression class. An
113
+ "@file prefix passes through verbatim" assertion catches a "helpful
114
+ preprocessor" regression. Prepare → contract end-to-end via
115
+ `validateUpstreamCliArgs` (REGRESSIONS D pattern; closes the slice
116
+ α/γ/δ contract-table gap class).
117
+ - **Tε** — `UPSTREAM_CLI_CONTRACTS` introspection + mechanical fixture
118
+ validation in the same `it()` block. Explicit assertion that
119
+ `--sandbox` has **no `values` enum** (catches the "freeform vs enum"
120
+ regression that an over-zealous future contributor might introduce).
121
+ Extra Tε-2 probe asserts a non-standard sandbox profile passes
122
+ `validateUpstreamCliArgs`.
123
+
124
+ ### Mechanical anchors (verify with `rg` before relying)
125
+
126
+ - `src/index.ts` — `prepareGrokRequest` signature gains five fields
127
+ (`:1968-1995`), emission block (`:2088-2110`), `GrokRequestParams`
128
+ interface (`:2819-2829`), `handleGrokRequest` threading
129
+ (`:2854-2858`), `handleGrokRequestAsync` threading (`:3041-3045`),
130
+ sync `grok_request` Zod registration (`:4890-4922`), async
131
+ `grok_request_async` Zod registration (`:5906-5938`).
132
+ - `src/upstream-contracts.ts` — `grok.mcpParameters` (`:459-463`),
133
+ `grok.flags` entries (`:501-524`), conformance fixtures
134
+ (`:559-587`).
135
+
136
+ ## [1.12.0] - 2026-05-27 — Phase 4 slice ζ (working-dir + add-dir cross-provider)
137
+
138
+ Ships the seventh Phase 4 slice: working-directory and additional-directory
139
+ flags are now reachable across four CLIs in a single bundled PR. Three
140
+ commits land together (feature wiring, contract registration, test-veracity
141
+ regressions) plus this release commit.
142
+
143
+ ### Added — working-dir + add-dir parity for four CLIs
144
+
145
+ - **Claude** — `claude_request` and `claude_request_async` accept a new
146
+ `addDir: string[]` field. Threaded through `prepareClaudeRequest` →
147
+ `prepareClaudeHighImpactFlags` (`src/request-helpers.ts:687`). Each
148
+ entry emits its own `--add-dir` instance per `claude --help` ("Additional
149
+ directories to allow tool access to"). Claude has no working-dir flag
150
+ (uses the process cwd).
151
+ - **Codex** — `codex_request` and `codex_request_async` accept new
152
+ `workingDir: string` (min 1) and `addDir: string[]` fields. Both flags
153
+ are already in `CODEX_RESUME_FILTERED_FLAGS` (the original session's cwd
154
+ and writable-dir policy are inherited on resume), so `prepareCodexRequest`
155
+ gates emission on `sessionPlan.mode === "new"` — resume argv stays clean
156
+ rather than emitting then stripping. Emits `-C <DIR>` (one) and
157
+ `--add-dir <DIR>` (one instance per entry).
158
+ - **Grok** — `grok_request` and `grok_request_async` accept a new
159
+ `workingDir: string` (min 1) field. `prepareGrokRequest` emits
160
+ `--cwd <DIR>`. Grok has no `--add-dir` analogue.
161
+ - **Vibe (Mistral)** — `mistral_request` and `mistral_request_async`
162
+ accept new `workingDir: string` (min 1) and `addDir: string[]` fields.
163
+ `prepareMistralRequest` (the `request-helpers.ts` helper) emits
164
+ `--workdir <DIR>` (one) and `--add-dir <DIR>` (one per entry; Vibe's
165
+ `--help` states the flag "Can be specified multiple times").
166
+ `buildMistralRetryPrep` threads both fields through to the stale-model
167
+ recovery argv per the slice-δ retry-path invariant.
168
+ - **Gemini** is not re-wired: `--include-directories` was wired in master
169
+ before this slice. A regression-guard test in REGRESSIONS Zε asserts
170
+ the existing wiring stays intact while adjacent contract entries
171
+ changed.
172
+
173
+ ### Out of scope — worktree flags
174
+
175
+ Worktree flags (`-w/--worktree` on Claude, Gemini, Grok) create new git
176
+ worktree directories on disk with lifecycle implications and are
177
+ explicitly deferred to a later slice with explicit cleanup semantics.
178
+
179
+ ### Contract surface
180
+
181
+ `UPSTREAM_CLI_CONTRACTS` updates:
182
+
183
+ - `claude.flags["--add-dir"]` (arity:"one"; repeated instances accepted)
184
+ - `codex.flags["-C"]` (the gateway only emits the short form; codex
185
+ 0.134.0 accepts `--cd` as an alias but the contract registers exactly
186
+ what we emit — a future code path that emitted `--cd` would correctly
187
+ fail the contract check).
188
+ - `codex.flags["--add-dir"]`
189
+ - `grok.flags["--cwd"]`
190
+ - `mistral.flags["--workdir"]`
191
+ - `mistral.flags["--add-dir"]`
192
+ - `mcpParameters` arrays updated for all four CLIs.
193
+ - Six new passing conformance fixtures (`claude-add-dir`,
194
+ `codex-working-dir`, `codex-add-dir`, `grok-working-dir`,
195
+ `mistral-working-dir`, `mistral-add-dir`); each is mechanically
196
+ validated against `validateUpstreamCliArgs` in the REGRESSIONS Zε
197
+ suite, closing the gap class identified in slice ε round 1.
198
+
199
+ ### Test-veracity audit
200
+
201
+ Per the standing protocol (`feedback_test_veracity_audit_protocol`),
202
+ this slice's tests were audited by all five LLM reviewers (Codex,
203
+ Gemini, Grok, Mistral, Claude) in async parallel with mandatory
204
+ mutation-probe execution against `docs/plans/test-veracity-audit-slice-zeta.spec.md`.
205
+
206
+ **Round 1 outcomes:**
207
+
208
+ - Codex: UNCONDITIONAL APPROVE — all 13 probes [as predicted], all 37
209
+ tests VERIFIED. Baseline (`npx vitest run` on the slice file: 37/37;
210
+ `npm test`: 54 files / 853 tests; build + format:check clean).
211
+ - Grok: UNCONDITIONAL APPROVE — all 13 probes [as predicted].
212
+ - Mistral: UNCONDITIONAL APPROVE — all 13 probes [as predicted].
213
+ - Claude: UNCONDITIONAL APPROVE — all 13 probes red as predicted; ran
214
+ in an isolated `/tmp/zeta-audit-claude` worktree because the four
215
+ parallel reviewers were concurrently mutating the live tree.
216
+ - Gemini: UNCONDITIONAL APPROVE — all 13 probes [as predicted].
217
+
218
+ First unanimous round-1 pass on a multi-CLI slice. The 37 new tests
219
+ (816 → 853 total) cover every new field/flag/fixture across REGRESSIONS
220
+ Zα/β/ε:
221
+
222
+ - **Zα** — Registered tool inputSchema for every new field on every
223
+ tool (sync + async), including `.min(1)` empty-string rejection on
224
+ `workingDir`.
225
+ - **Zβ** — `prepare*Request` end-to-end argv emission per CLI. The
226
+ Codex resume branch asserts NEITHER `-C` NOR `--add-dir` appears
227
+ in resume argv. `buildMistralRetryPrep` regression catches the
228
+ slice-δ retry-path bug class. Prepare → contract end-to-end
229
+ consistency covers all four CLIs.
230
+ - **Zε** — `UPSTREAM_CLI_CONTRACTS` introspection + mechanical
231
+ fixture validation in the same `it()` block (slice-ε round-1 gap
232
+ class). Includes a regression guard for the pre-existing Gemini
233
+ `--include-directories` wiring.
234
+
235
+ ### Mechanical anchors (verify with `rg` before relying)
236
+
237
+ - `src/request-helpers.ts` — `ClaudeHighImpactFlagsInput.addDir`
238
+ (`:610`), `prepareClaudeHighImpactFlags` emission (`:686-690`).
239
+ `PrepareMistralRequestInput.workingDir`/`.addDir` (`:248-264`),
240
+ `prepareMistralRequest` emission (`:300-307`).
241
+ - `src/index.ts` — `prepareClaudeRequest` (`:1338`),
242
+ `prepareCodexRequest` new-session gate (`:1687-1700`),
243
+ `prepareGrokRequest` `--cwd` emission (`:2065-2067`),
244
+ `prepareMistralRequest` wrapper (`:2153-2168`),
245
+ `buildMistralRetryPrep` (`:2249-2289`).
246
+ - `src/upstream-contracts.ts` — flag registrations and conformance
247
+ fixtures for the four CLIs (`:146-149`, `:281-292`, `:438-441`,
248
+ `:524-533`, plus `mcpParameters` entries).
249
+
5
250
  ## [1.11.0] - 2026-05-27 — Phase 4 slice η (Claude `--fallback-model` + `--json-schema`)
6
251
 
7
252
  Ships the sixth Phase 4 slice: Claude's reliability fallback and
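Reviewer note on the 1.13.0 contract entries above: the changelog says `arity:"one"` means "consumes one value per instance", so repeated `--allow` / `--deny` pass, and that `--sandbox` carries no `values` enum, so any profile name passes. The sketch below illustrates those two properties only. The `FlagContract` shape, the `grokFlagsSketch` table, and `validateArgsSketch` are hypothetical stand-ins; the real `UPSTREAM_CLI_CONTRACTS` type and `validateUpstreamCliArgs` signature are not shown in this diff.

```typescript
// Hypothetical contract shape (assumption): one value per flag instance,
// with an optional enum of accepted values.
type FlagContract = { arity: "one"; values?: readonly string[] };

// Hypothetical table mirroring the five entries described in the changelog.
const grokFlagsSketch: Record<string, FlagContract> = {
  "--sandbox": { arity: "one" }, // freeform: deliberately no `values` enum
  "--rules": { arity: "one" },
  "--system-prompt-override": { arity: "one" },
  "--allow": { arity: "one" }, // repeatable: each instance consumes one value
  "--deny": { arity: "one" },
};

// Hypothetical validator, NOT the gateway's validateUpstreamCliArgs.
// Walks argv as flag/value pairs and collects human-readable errors.
function validateArgsSketch(args: string[], flags: Record<string, FlagContract>): string[] {
  const errors: string[] = [];
  let i = 0;
  while (i < args.length) {
    const flag = args[i];
    const spec = Object.prototype.hasOwnProperty.call(flags, flag) ? flags[flag] : undefined;
    if (spec === undefined) {
      errors.push(`unregistered flag: ${flag}`);
      i += 1;
      continue;
    }
    const value = i + 1 < args.length ? args[i + 1] : undefined;
    if (value === undefined) {
      errors.push(`missing value for ${flag}`);
      i += 1;
      continue;
    }
    // Only enforce an enum when the contract actually declares one.
    if (spec.values !== undefined && spec.values.indexOf(value) === -1) {
      errors.push(`value not in enum for ${flag}: ${value}`);
    }
    i += 2;
  }
  return errors;
}

// A non-standard sandbox profile plus repeated --allow instances (placeholder values).
const okErrors = validateArgsSketch(
  ["--sandbox", "custom-profile", "--allow", "rule-a", "--allow", "rule-b"],
  grokFlagsSketch,
);

// A flag that the sketch table does not register is reported.
const badErrors = validateArgsSketch(["--add-dir", "/tmp/x"], grokFlagsSketch);
```

Under these assumptions, adding a `values` enum to `--sandbox` would make the first call fail, which is the regression class the changelog's Tε-2 probe is described as guarding against.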
package/dist/index.d.ts CHANGED
@@ -157,6 +157,7 @@ export declare function prepareClaudeRequest(params: {
157
157
  excludeDynamicSystemPromptSections?: boolean;
158
158
  fallbackModel?: string;
159
159
  jsonSchema?: string | Record<string, unknown>;
160
+ addDir?: string[];
160
161
  }, runtime?: GatewayServerRuntime): CliRequestPrep | ExtendedToolResponse;
161
162
  export interface CodexRequestPrep extends CliRequestPrep {
162
163
  /**
@@ -199,6 +200,8 @@ export declare function prepareCodexRequest(params: {
199
200
  images?: string[];
200
201
  ignoreUserConfig?: boolean;
201
202
  ignoreRules?: boolean;
203
+ workingDir?: string;
204
+ addDir?: string[];
202
205
  }, runtime?: GatewayServerRuntime): CodexRequestPrep | ExtendedToolResponse;
203
206
  export declare function prepareGeminiRequest(params: {
204
207
  prompt?: string;
@@ -254,6 +257,31 @@ export declare function prepareGrokRequest(params: {
254
257
  * iterations for cost / latency control. Mirrors Claude's wiring.
255
258
  */
256
259
  maxTurns?: number;
260
+ /**
261
+ * Phase 4 slice ζ: emit `--cwd <DIR>` so headless callers can set Grok's
262
+ * working directory without depending on the gateway process's cwd.
263
+ */
264
+ workingDir?: string;
265
+ /**
266
+ * Phase 4 slice θ — Grok HIGH parity. All five are passthrough flags:
267
+ *
268
+ * - `sandbox` → `--sandbox <PROFILE>` (freeform; Grok 0.1.210 --help
269
+ * shows no enum constraint, unlike --effort / --permission-mode /
270
+ * --output-format which all show `[possible values: …]`).
271
+ * - `rules` → `--rules <RULES>`. Supports `@file` prefix; gateway
272
+ * passes the value verbatim and lets Grok parse it.
273
+ * - `systemPromptOverride` → `--system-prompt-override <PROMPT>`.
274
+ * Distinct from Claude's --system-prompt / --append-system-prompt
275
+ * (Grok has only one override flag).
276
+ * - `allow` / `deny` → repeatable `--allow <RULE>` / `--deny <RULE>`
277
+ * per --help ("Repeat to add multiple rules"). One argv pair per
278
+ * entry — NOT comma-joined like --tools / --disallowed-tools.
279
+ */
280
+ sandbox?: string;
281
+ rules?: string;
282
+ systemPromptOverride?: string;
283
+ allow?: string[];
284
+ deny?: string[];
257
285
  }, runtime?: GatewayServerRuntime): CliRequestPrep | ExtendedToolResponse;
258
286
  export declare function prepareMistralRequest(params: {
259
287
  prompt?: string;
@@ -280,6 +308,10 @@ export declare function prepareMistralRequest(params: {
280
308
  maxTurns?: number;
281
309
  /** Phase 4 slice δ: Vibe `--max-price DOLLARS` cumulative-cost cap. */
282
310
  maxPrice?: number;
311
+ /** Phase 4 slice ζ: Vibe `--workdir <DIR>` working-directory parity. */
312
+ workingDir?: string;
313
+ /** Phase 4 slice ζ: Vibe `--add-dir <DIR>` repeatable add-dir parity. */
314
+ addDir?: string[];
283
315
  }, runtime?: GatewayServerRuntime): (CliRequestPrep & {
284
316
  mistralEnv: Record<string, string>;
285
317
  }) | ExtendedToolResponse;
@@ -292,7 +324,7 @@ export declare function prepareMistralRequest(params: {
292
324
  * through here, or a fresh-workspace / budgeted run can degrade on
293
325
  * the second attempt.
294
326
  */
295
- export declare function buildMistralRetryPrep(params: Pick<MistralRequestParams, "outputFormat" | "permissionMode" | "effort" | "reasoningEffort" | "allowedTools" | "disallowedTools" | "approvalStrategy" | "trust" | "maxTurns" | "maxPrice"> & {
327
+ export declare function buildMistralRetryPrep(params: Pick<MistralRequestParams, "outputFormat" | "permissionMode" | "effort" | "reasoningEffort" | "allowedTools" | "disallowedTools" | "approvalStrategy" | "trust" | "maxTurns" | "maxPrice" | "workingDir" | "addDir"> & {
296
328
  effectivePrompt: string;
297
329
  }, recoveryModel: string): {
298
330
  args: string[];
@@ -368,6 +400,18 @@ export interface GrokRequestParams {
368
400
  forceRefresh?: boolean;
369
401
  /** Phase 4 slice δ: cap agent-loop iterations via `--max-turns N`. */
370
402
  maxTurns?: number;
403
+ /** Phase 4 slice ζ: emit `--cwd <DIR>` so the CLI uses the specified working directory. */
404
+ workingDir?: string;
405
+ /** Phase 4 slice θ: Grok `--sandbox <PROFILE>` (freeform passthrough). */
406
+ sandbox?: string;
407
+ /** Phase 4 slice θ: Grok `--rules <RULES>` (supports `@file` prefix; verbatim passthrough). */
408
+ rules?: string;
409
+ /** Phase 4 slice θ: Grok `--system-prompt-override <PROMPT>`. */
410
+ systemPromptOverride?: string;
411
+ /** Phase 4 slice θ: Grok `--allow <RULE>` (repeatable; one entry per --allow instance). */
412
+ allow?: string[];
413
+ /** Phase 4 slice θ: Grok `--deny <RULE>` (repeatable; one entry per --deny instance). */
414
+ deny?: string[];
371
415
  }
372
416
  export declare function handleGrokRequest(deps: HandlerDeps, params: GrokRequestParams): Promise<ExtendedToolResponse>;
373
417
  export declare function handleGrokRequestAsync(deps: AsyncHandlerDeps, params: Omit<GrokRequestParams, "optimizeResponse">): Promise<ExtendedToolResponse>;
@@ -398,6 +442,10 @@ export interface MistralRequestParams {
398
442
  maxTurns?: number;
399
443
  /** Phase 4 slice δ: Vibe `--max-price DOLLARS` cumulative-cost cap. */
400
444
  maxPrice?: number;
445
+ /** Phase 4 slice ζ: Vibe `--workdir <DIR>` working-directory parity. */
446
+ workingDir?: string;
447
+ /** Phase 4 slice ζ: Vibe `--add-dir <DIR>` repeatable add-dir parity. */
448
+ addDir?: string[];
401
449
  }
402
450
  export declare function handleMistralRequest(deps: HandlerDeps, params: MistralRequestParams): Promise<ExtendedToolResponse>;
403
451
  export declare function handleMistralRequestAsync(deps: AsyncHandlerDeps, params: Omit<MistralRequestParams, "optimizeResponse">): Promise<ExtendedToolResponse>;
@@ -430,6 +478,8 @@ export declare function handleCodexRequestAsync(deps: AsyncHandlerDeps, params:
430
478
  images?: string[];
431
479
  ignoreUserConfig?: boolean;
432
480
  ignoreRules?: boolean;
481
+ workingDir?: string;
482
+ addDir?: string[];
433
483
  }): Promise<ExtendedToolResponse>;
434
484
  export declare function createGatewayServer(deps?: GatewayServerDeps): McpServer;
435
485
  export {};
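Reviewer note on the `prepareGrokRequest` declarations above: the doc comment describes `allow` / `deny` as "one argv pair per entry" rather than comma-joined, and `rules` as a verbatim passthrough (including any `@file` prefix). A minimal sketch of that emission shape follows. `GrokFlagParamsSketch` and `emitGrokFlagsSketch` are hypothetical names, and the profile, file, and rule strings are placeholders rather than known-valid Grok values.

```typescript
// Hypothetical subset of the Grok parameters declared in this diff.
interface GrokFlagParamsSketch {
  workingDir?: string;
  sandbox?: string;
  rules?: string;
  systemPromptOverride?: string;
  allow?: string[];
  deny?: string[];
}

// Sketch only: emits each scalar flag once and each array entry as its own
// flag/value pair. This is an illustration, not the gateway's exported function.
function emitGrokFlagsSketch(params: GrokFlagParamsSketch): string[] {
  const args: string[] = [];
  if (params.workingDir) args.push("--cwd", params.workingDir);
  if (params.sandbox) args.push("--sandbox", params.sandbox);
  // Passed through untouched, so a leading "@" is left for the CLI to interpret.
  if (params.rules) args.push("--rules", params.rules);
  if (params.systemPromptOverride) {
    args.push("--system-prompt-override", params.systemPromptOverride);
  }
  for (const rule of params.allow ?? []) args.push("--allow", rule);
  for (const rule of params.deny ?? []) args.push("--deny", rule);
  return args;
}

// Placeholder inputs; none of these strings are taken from Grok documentation.
const sketchArgv = emitGrokFlagsSketch({
  sandbox: "custom-profile",
  rules: "@rules.md",
  allow: ["rule-a", "rule-b"],
  deny: ["rule-c"],
});
console.log(sketchArgv.join(" "));
```

With two `allow` entries the result contains two separate `--allow` tokens, each followed by its own value, which is the property the changelog's Tβ assertions are described as checking.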
package/dist/index.js CHANGED
@@ -1007,6 +1007,7 @@ export function prepareClaudeRequest(params, runtime = resolveGatewayServerRunti
1007
1007
  excludeDynamicSystemPromptSections: params.excludeDynamicSystemPromptSections,
1008
1008
  fallbackModel: params.fallbackModel,
1009
1009
  jsonSchema: params.jsonSchema,
1010
+ addDir: params.addDir,
1010
1011
  }));
1011
1012
  return {
1012
1013
  corrId,
@@ -1126,6 +1127,19 @@ export function prepareCodexRequest(params, runtime = resolveGatewayServerRuntim
1126
1127
  // and are emitted in both branches.
1127
1128
  let highImpactCleanup;
1128
1129
  if (sessionPlan.mode === "new") {
1130
+ // Phase 4 slice ζ: emit working-dir and add-dir on new sessions only.
1131
+ // Both flags are listed in CODEX_RESUME_FILTERED_FLAGS — resume inherits
1132
+ // the original session's cwd and writable-dir policy, so emitting them
1133
+ // on resume would be silently stripped (wasteful + misleading on argv
1134
+ // logs). Gating here mirrors `--search` / `--sandbox` / `--full-auto`.
1135
+ if (params.workingDir) {
1136
+ args.push("-C", params.workingDir);
1137
+ }
1138
+ if (params.addDir && params.addDir.length > 0) {
1139
+ for (const dir of params.addDir) {
1140
+ args.push("--add-dir", dir);
1141
+ }
1142
+ }
1129
1143
  const high = prepareCodexHighImpactFlags({
1130
1144
  outputSchema: params.outputSchema,
1131
1145
  search: params.search,
@@ -1381,6 +1395,28 @@ export function prepareGrokRequest(params, runtime = resolveGatewayServerRuntime
1381
1395
  if (params.maxTurns !== undefined) {
1382
1396
  args.push("--max-turns", String(params.maxTurns));
1383
1397
  }
1398
+ if (params.workingDir) {
1399
+ args.push("--cwd", params.workingDir);
1400
+ }
1401
+ if (params.sandbox) {
1402
+ args.push("--sandbox", params.sandbox);
1403
+ }
1404
+ if (params.rules) {
1405
+ args.push("--rules", params.rules);
1406
+ }
1407
+ if (params.systemPromptOverride) {
1408
+ args.push("--system-prompt-override", params.systemPromptOverride);
1409
+ }
1410
+ if (params.allow && params.allow.length > 0) {
1411
+ for (const rule of params.allow) {
1412
+ args.push("--allow", rule);
1413
+ }
1414
+ }
1415
+ if (params.deny && params.deny.length > 0) {
1416
+ for (const rule of params.deny) {
1417
+ args.push("--deny", rule);
1418
+ }
1419
+ }
1384
1420
  return {
1385
1421
  corrId,
1386
1422
  effectivePrompt,
@@ -1467,6 +1503,8 @@ export function prepareMistralRequest(params, runtime = resolveGatewayServerRunt
1467
1503
  trust: params.trust,
1468
1504
  maxTurns: params.maxTurns,
1469
1505
  maxPrice: params.maxPrice,
1506
+ workingDir: params.workingDir,
1507
+ addDir: params.addDir,
1470
1508
  });
1471
1509
  if (prep.ignoredDisallowedTools) {
1472
1510
  runtime.logger.info(`[${corrId}] Mistral does not support disallowedTools; ignoring (caller passed ${params.disallowedTools?.length ?? 0} entries)`);
@@ -1521,6 +1559,8 @@ export function buildMistralRetryPrep(params, recoveryModel) {
1521
1559
  trust: params.trust,
1522
1560
  maxTurns: params.maxTurns,
1523
1561
  maxPrice: params.maxPrice,
1562
+ workingDir: params.workingDir,
1563
+ addDir: params.addDir,
1524
1564
  });
1525
1565
  }
1526
1566
  function buildCliResponse(cli, stdout, optimizeResponse, corrId, sessionId, prep, durationMs, resumable, outputFormat, warnings) {
@@ -1862,6 +1902,12 @@ export async function handleGrokRequest(deps, params) {
1862
1902
  optimizePrompt: params.optimizePrompt,
1863
1903
  operation: "grok_request",
1864
1904
  maxTurns: params.maxTurns,
1905
+ workingDir: params.workingDir,
1906
+ sandbox: params.sandbox,
1907
+ rules: params.rules,
1908
+ systemPromptOverride: params.systemPromptOverride,
1909
+ allow: params.allow,
1910
+ deny: params.deny,
1865
1911
  }, runtime);
1866
1912
  if (!("args" in prep))
1867
1913
  return prep;
@@ -1983,6 +2029,12 @@ export async function handleGrokRequestAsync(deps, params) {
1983
2029
  optimizePrompt: params.optimizePrompt,
1984
2030
  operation: "grok_request_async",
1985
2031
  maxTurns: params.maxTurns,
2032
+ workingDir: params.workingDir,
2033
+ sandbox: params.sandbox,
2034
+ rules: params.rules,
2035
+ systemPromptOverride: params.systemPromptOverride,
2036
+ allow: params.allow,
2037
+ deny: params.deny,
1986
2038
  }, runtime);
1987
2039
  if (!("args" in prep))
1988
2040
  return prep;
@@ -2067,6 +2119,8 @@ export async function handleMistralRequest(deps, params) {
2067
2119
  trust: params.trust,
2068
2120
  maxTurns: params.maxTurns,
2069
2121
  maxPrice: params.maxPrice,
2122
+ workingDir: params.workingDir,
2123
+ addDir: params.addDir,
2070
2124
  }, runtime);
2071
2125
  if (!("args" in prep))
2072
2126
  return prep;
@@ -2202,6 +2256,8 @@ export async function handleMistralRequestAsync(deps, params) {
2202
2256
  trust: params.trust,
2203
2257
  maxTurns: params.maxTurns,
2204
2258
  maxPrice: params.maxPrice,
2259
+ workingDir: params.workingDir,
2260
+ addDir: params.addDir,
2205
2261
  }, runtime);
2206
2262
  if (!("args" in prep))
2207
2263
  return prep;
@@ -2290,6 +2346,8 @@ export async function handleCodexRequestAsync(deps, params) {
2290
2346
  images: params.images,
2291
2347
  ignoreUserConfig: params.ignoreUserConfig,
2292
2348
  ignoreRules: params.ignoreRules,
2349
+ workingDir: params.workingDir,
2350
+ addDir: params.addDir,
2293
2351
  }, runtime);
2294
2352
  if (!("args" in prep))
2295
2353
  return prep;
@@ -2493,6 +2551,11 @@ export function createGatewayServer(deps = {}) {
2493
2551
  .union([z.string(), z.record(z.unknown())])
2494
2552
  .optional()
2495
2553
  .describe("Claude --json-schema: JSON Schema literal (NOT a path) constraining structured output. Object values are JSON.stringify-d; string values are passed verbatim. Use with outputFormat='json'."),
2554
+ // Phase 4 slice ζ — Claude additional-workspace-dirs parity
2555
+ addDir: z
2556
+ .array(z.string())
2557
+ .optional()
2558
+ .describe("Claude --add-dir: additional directories the CLI is allowed to read/write beyond the process cwd. Each entry is emitted as its own --add-dir instance."),
2496
2559
  approvalStrategy: z
2497
2560
  .enum(["legacy", "mcp_managed"])
2498
2561
  .default("legacy")
@@ -2523,7 +2586,7 @@ export function createGatewayServer(deps = {}) {
2523
2586
  .boolean()
2524
2587
  .default(false)
2525
2588
  .describe("Bypass dedup and force a fresh CLI run even if a recent identical request exists"),
2526
- }, async ({ prompt, promptParts, model, outputFormat, sessionId, continueSession, createNewSession, allowedTools, disallowedTools, dangerouslySkipPermissions, permissionMode, agent, agents, forkSession, systemPrompt, appendSystemPrompt, maxBudgetUsd, maxTurns, effort, excludeDynamicSystemPromptSections, fallbackModel, jsonSchema, approvalStrategy, approvalPolicy, mcpServers, strictMcpConfig, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, }) => {
2589
+ }, async ({ prompt, promptParts, model, outputFormat, sessionId, continueSession, createNewSession, allowedTools, disallowedTools, dangerouslySkipPermissions, permissionMode, agent, agents, forkSession, systemPrompt, appendSystemPrompt, maxBudgetUsd, maxTurns, effort, excludeDynamicSystemPromptSections, fallbackModel, jsonSchema, addDir, approvalStrategy, approvalPolicy, mcpServers, strictMcpConfig, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, }) => {
2527
2590
  const startTime = Date.now();
2528
2591
  if (systemPrompt !== undefined && appendSystemPrompt !== undefined) {
2529
2592
  return createErrorResponse("claude", 1, "", correlationId, new Error("systemPrompt and appendSystemPrompt are mutually exclusive; use one or the other (not both)."));
@@ -2555,6 +2618,7 @@ export function createGatewayServer(deps = {}) {
2555
2618
  excludeDynamicSystemPromptSections,
2556
2619
  fallbackModel,
2557
2620
  jsonSchema,
2621
+ addDir,
2558
2622
  }, runtime);
2559
2623
  if (!("args" in prep))
2560
2624
  return prep;
@@ -2809,7 +2873,17 @@ export function createGatewayServer(deps = {}) {
2809
2873
  .boolean()
2810
2874
  .optional()
2811
2875
  .describe("Codex --ignore-rules: skip project rule files for this run."),
2812
- }, async ({ prompt, promptParts, model, fullAuto, sandboxMode, askForApproval, useLegacyFullAutoFlag, dangerouslyBypassApprovalsAndSandbox, approvalStrategy, approvalPolicy, mcpServers, sessionId, resumeLatest, createNewSession, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, outputFormat, outputSchema, search, profile, configOverrides, ephemeral, images, ignoreUserConfig, ignoreRules, }) => {
2876
+ // Phase 4 slice ζ Codex working-dir + add-dir parity (new sessions only).
2877
+ workingDir: z
2878
+ .string()
2879
+ .min(1)
2880
+ .optional()
2881
+ .describe("Codex -C/--cd <DIR>: working root for this session. Emitted on new sessions only; resume inherits the original session's cwd via CODEX_RESUME_FILTERED_FLAGS."),
2882
+ addDir: z
2883
+ .array(z.string())
2884
+ .optional()
2885
+ .describe("Codex --add-dir <DIR>: additional writable workspace directories. Emitted once per entry on new sessions only; resume inherits the original session's writable-dir policy."),
2886
+ }, async ({ prompt, promptParts, model, fullAuto, sandboxMode, askForApproval, useLegacyFullAutoFlag, dangerouslyBypassApprovalsAndSandbox, approvalStrategy, approvalPolicy, mcpServers, sessionId, resumeLatest, createNewSession, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, outputFormat, outputSchema, search, profile, configOverrides, ephemeral, images, ignoreUserConfig, ignoreRules, workingDir, addDir, }) => {
2813
2887
  const startTime = Date.now();
2814
2888
  const prep = prepareCodexRequest({
2815
2889
  prompt,
@@ -2838,6 +2912,8 @@ export function createGatewayServer(deps = {}) {
2838
2912
  images,
2839
2913
  ignoreUserConfig,
2840
2914
  ignoreRules,
2915
+ workingDir,
2916
+ addDir,
2841
2917
  }, runtime);
2842
2918
  if (!("args" in prep))
2843
2919
  return prep;
@@ -3209,7 +3285,37 @@ export function createGatewayServer(deps = {}) {
  .default(false)
  .describe("Bypass dedup and force a fresh CLI run even if a recent identical request exists"),
  maxTurns: MAX_TURNS_SCHEMA.optional().describe("Grok `--max-turns N`: cap on agent-loop iterations for cost / latency control (Phase 4 slice δ). Bounded to safe integers ≤ 10000."),
- }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, alwaysApprove, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, maxTurns, }) => {
+ // Phase 4 slice ζ Grok working-directory parity.
+ workingDir: z
+ .string()
+ .min(1)
+ .optional()
+ .describe("Grok --cwd <DIR>: working directory for this invocation. Lets headless callers run Grok against a directory other than the gateway process's cwd."),
+ // Phase 4 slice θ — Grok HIGH parity (sandbox, rules, system-prompt-override, allow, deny).
+ sandbox: z
+ .string()
+ .min(1)
+ .optional()
+ .describe("Grok --sandbox <PROFILE>: sandbox profile for filesystem and network access. Freeform per `grok --help` (no enum constraint on Grok 0.1.210); also settable via GROK_SANDBOX env var. Caller responsibility to pass a valid profile name."),
+ rules: z
+ .string()
+ .min(1)
+ .optional()
+ .describe("Grok --rules <RULES>: extra rules to append to the system prompt. Supports `@file` prefix per `grok --help` to load from a file; gateway passes the value verbatim and lets Grok parse the prefix."),
+ systemPromptOverride: z
+ .string()
+ .min(1)
+ .optional()
+ .describe("Grok --system-prompt-override <PROMPT>: replace the agent's system prompt entirely. Distinct from Claude's --system-prompt / --append-system-prompt (Grok has only one override flag, not a pair)."),
+ allow: z
+ .array(z.string())
+ .optional()
+ .describe('Grok --allow <RULE>: permission allow rules. Each entry is emitted as its own --allow instance (per `grok --help`: "Repeat to add multiple rules").'),
+ deny: z
+ .array(z.string())
+ .optional()
+ .describe('Grok --deny <RULE>: permission deny rules. Each entry is emitted as its own --deny instance (per `grok --help`: "Repeat to add multiple rules").'),
+ }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, alwaysApprove, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, maxTurns, workingDir, sandbox, rules, systemPromptOverride, allow, deny, }) => {
  return handleGrokRequest({ sessionManager, logger, runtime }, {
  prompt,
  promptParts,
@@ -3233,6 +3339,12 @@ export function createGatewayServer(deps = {}) {
  idleTimeoutMs,
  forceRefresh,
  maxTurns,
+ workingDir,
+ sandbox,
+ rules,
+ systemPromptOverride,
+ allow,
+ deny,
  });
  });
  //──────────────────────────────────────────────────────────────────────────────
@@ -3312,7 +3424,17 @@ export function createGatewayServer(deps = {}) {
  .describe("Emit `--trust` so Vibe trusts the cwd for this invocation only (not persisted to trusted_folders.toml) and skips the interactive trust prompt (Phase 4 slice γ)."),
  maxTurns: MAX_TURNS_SCHEMA.optional().describe("Vibe `--max-turns N`: cap the agent-loop iteration count (programmatic mode only, Phase 4 slice δ). Bounded to safe integers ≤ 10000."),
  maxPrice: MAX_PRICE_SCHEMA.optional().describe("Vibe `--max-price DOLLARS`: interrupt the session when cumulative cost crosses this cap (programmatic mode only, Phase 4 slice δ). Bounded to finite values ≤ 10000 USD."),
- }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, trust, maxTurns, maxPrice, }) => {
+ // Phase 4 slice ζ Vibe working-directory + additional-dirs parity.
+ workingDir: z
+ .string()
+ .min(1)
+ .optional()
+ .describe("Vibe --workdir <DIR>: change to this directory before running. Single value (Vibe accepts one --workdir per invocation)."),
+ addDir: z
+ .array(z.string())
+ .optional()
+ .describe("Vibe --add-dir <DIR>: additional writable workspace directories. Each entry is emitted as its own --add-dir instance (Vibe states this flag may be specified multiple times)."),
+ }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, trust, maxTurns, maxPrice, workingDir, addDir, }) => {
  return handleMistralRequest({ sessionManager, logger, runtime }, {
  prompt,
  promptParts,
@@ -3337,6 +3459,8 @@ export function createGatewayServer(deps = {}) {
  trust,
  maxTurns,
  maxPrice,
+ workingDir,
+ addDir,
  });
  });
  //──────────────────────────────────────────────────────────────────────────────
@@ -3432,6 +3556,11 @@ export function createGatewayServer(deps = {}) {
  .union([z.string(), z.record(z.unknown())])
  .optional()
  .describe("Claude --json-schema: JSON Schema literal (NOT a path) constraining structured output. Object values are JSON.stringify-d; string values are passed verbatim. Use with outputFormat='json'."),
+ // Phase 4 slice ζ — Claude additional-workspace-dirs parity
+ addDir: z
+ .array(z.string())
+ .optional()
+ .describe("Claude --add-dir: additional directories the CLI is allowed to read/write beyond the process cwd. Each entry is emitted as its own --add-dir instance."),
  approvalStrategy: z
  .enum(["legacy", "mcp_managed"])
  .default("legacy")
@@ -3461,7 +3590,7 @@ export function createGatewayServer(deps = {}) {
  .boolean()
  .default(false)
  .describe("Bypass dedup and force a fresh CLI run even if a recent identical request exists"),
- }, async ({ prompt, promptParts, model, outputFormat, sessionId, continueSession, createNewSession, allowedTools, disallowedTools, dangerouslySkipPermissions, permissionMode, agent, agents, forkSession, systemPrompt, appendSystemPrompt, maxBudgetUsd, maxTurns, effort, excludeDynamicSystemPromptSections, fallbackModel, jsonSchema, approvalStrategy, approvalPolicy, mcpServers, strictMcpConfig, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, }) => {
+ }, async ({ prompt, promptParts, model, outputFormat, sessionId, continueSession, createNewSession, allowedTools, disallowedTools, dangerouslySkipPermissions, permissionMode, agent, agents, forkSession, systemPrompt, appendSystemPrompt, maxBudgetUsd, maxTurns, effort, excludeDynamicSystemPromptSections, fallbackModel, jsonSchema, addDir, approvalStrategy, approvalPolicy, mcpServers, strictMcpConfig, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, }) => {
  if (systemPrompt !== undefined && appendSystemPrompt !== undefined) {
  return createErrorResponse("claude", 1, "", correlationId, new Error("systemPrompt and appendSystemPrompt are mutually exclusive; use one or the other (not both)."));
  }
@@ -3492,6 +3621,7 @@ export function createGatewayServer(deps = {}) {
  excludeDynamicSystemPromptSections,
  fallbackModel,
  jsonSchema,
+ addDir,
  }, runtime);
  if (!("args" in prep))
  return prep;
@@ -3646,7 +3776,17 @@ export function createGatewayServer(deps = {}) {
  images: z.array(z.string()).optional().describe("Codex -i <path>: image attachments."),
  ignoreUserConfig: z.boolean().optional().describe("Codex --ignore-user-config."),
  ignoreRules: z.boolean().optional().describe("Codex --ignore-rules."),
- }, async ({ prompt, promptParts, model, fullAuto, sandboxMode, askForApproval, useLegacyFullAutoFlag, dangerouslyBypassApprovalsAndSandbox, approvalStrategy, approvalPolicy, mcpServers, sessionId, resumeLatest, createNewSession, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, outputFormat, outputSchema, search, profile, configOverrides, ephemeral, images, ignoreUserConfig, ignoreRules, }) => {
+ // Phase 4 slice ζ Codex working-dir + add-dir parity (new sessions only).
+ workingDir: z
+ .string()
+ .min(1)
+ .optional()
+ .describe("Codex -C/--cd <DIR>: working root for this session. New sessions only; resume inherits the original session's cwd."),
+ addDir: z
+ .array(z.string())
+ .optional()
+ .describe("Codex --add-dir <DIR>: additional writable workspace directories (repeat per entry). New sessions only."),
+ }, async ({ prompt, promptParts, model, fullAuto, sandboxMode, askForApproval, useLegacyFullAutoFlag, dangerouslyBypassApprovalsAndSandbox, approvalStrategy, approvalPolicy, mcpServers, sessionId, resumeLatest, createNewSession, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, outputFormat, outputSchema, search, profile, configOverrides, ephemeral, images, ignoreUserConfig, ignoreRules, workingDir, addDir, }) => {
  return handleCodexRequestAsync({ sessionManager, asyncJobManager, logger, runtime }, {
  prompt,
  promptParts,
@@ -3675,6 +3815,8 @@ export function createGatewayServer(deps = {}) {
  images,
  ignoreUserConfig,
  ignoreRules,
+ workingDir,
+ addDir,
  });
  });
  server.tool("gemini_request_async", {
@@ -3841,7 +3983,37 @@ export function createGatewayServer(deps = {}) {
  .default(false)
  .describe("Bypass dedup and force a fresh CLI run even if a recent identical request exists"),
  maxTurns: MAX_TURNS_SCHEMA.optional().describe("Grok `--max-turns N`: cap on agent-loop iterations for cost / latency control (Phase 4 slice δ). Bounded to safe integers ≤ 10000."),
- }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, alwaysApprove, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, maxTurns, }) => {
+ // Phase 4 slice ζ Grok working-directory parity.
+ workingDir: z
+ .string()
+ .min(1)
+ .optional()
+ .describe("Grok --cwd <DIR>: working directory for this invocation. Lets headless callers run Grok against a directory other than the gateway process's cwd."),
+ // Phase 4 slice θ — Grok HIGH parity (sandbox, rules, system-prompt-override, allow, deny).
+ sandbox: z
+ .string()
+ .min(1)
+ .optional()
+ .describe("Grok --sandbox <PROFILE>: sandbox profile for filesystem and network access. Freeform per `grok --help` (no enum constraint); also settable via GROK_SANDBOX env var."),
+ rules: z
+ .string()
+ .min(1)
+ .optional()
+ .describe("Grok --rules <RULES>: extra rules to append to the system prompt. Supports `@file` prefix; gateway passes the value verbatim."),
+ systemPromptOverride: z
+ .string()
+ .min(1)
+ .optional()
+ .describe("Grok --system-prompt-override <PROMPT>: replace the agent's system prompt entirely."),
+ allow: z
+ .array(z.string())
+ .optional()
+ .describe("Grok --allow <RULE>: permission allow rules. Each entry → its own --allow instance."),
+ deny: z
+ .array(z.string())
+ .optional()
+ .describe("Grok --deny <RULE>: permission deny rules. Each entry → its own --deny instance."),
+ }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, alwaysApprove, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, maxTurns, workingDir, sandbox, rules, systemPromptOverride, allow, deny, }) => {
  return handleGrokRequestAsync({ sessionManager, asyncJobManager, logger, runtime }, {
  prompt,
  promptParts,
@@ -3864,6 +4036,12 @@ export function createGatewayServer(deps = {}) {
  idleTimeoutMs,
  forceRefresh,
  maxTurns,
+ workingDir,
+ sandbox,
+ rules,
+ systemPromptOverride,
+ allow,
+ deny,
  });
  });
  server.tool("mistral_request_async", {
@@ -3939,7 +4117,17 @@ export function createGatewayServer(deps = {}) {
  .describe("Emit `--trust` so Vibe trusts the cwd for this invocation only (not persisted to trusted_folders.toml) and skips the interactive trust prompt (Phase 4 slice γ)."),
  maxTurns: MAX_TURNS_SCHEMA.optional().describe("Vibe `--max-turns N`: cap the agent-loop iteration count (programmatic mode only, Phase 4 slice δ). Bounded to safe integers ≤ 10000."),
  maxPrice: MAX_PRICE_SCHEMA.optional().describe("Vibe `--max-price DOLLARS`: interrupt the session when cumulative cost crosses this cap (programmatic mode only, Phase 4 slice δ). Bounded to finite values ≤ 10000 USD."),
- }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, trust, maxTurns, maxPrice, }) => {
+ // Phase 4 slice ζ Vibe working-directory + additional-dirs parity.
+ workingDir: z
+ .string()
+ .min(1)
+ .optional()
+ .describe("Vibe --workdir <DIR>: change to this directory before running. Single value per invocation."),
+ addDir: z
+ .array(z.string())
+ .optional()
+ .describe("Vibe --add-dir <DIR>: additional writable workspace directories. Each entry is emitted as its own --add-dir instance."),
+ }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, trust, maxTurns, maxPrice, workingDir, addDir, }) => {
  return handleMistralRequestAsync({ sessionManager, asyncJobManager, logger, runtime }, {
  prompt,
  promptParts,
@@ -3963,6 +4151,8 @@ export function createGatewayServer(deps = {}) {
  trust,
  maxTurns,
  maxPrice,
+ workingDir,
+ addDir,
  });
  });
  server.tool("llm_job_status", {
@@ -125,6 +125,17 @@ export interface PrepareMistralRequestInput {
  * only).
  */
  maxPrice?: number;
+ /**
+ * Phase 4 slice ζ: emit `--workdir <DIR>` so Vibe changes into the named
+ * directory before running. Single value (Vibe accepts one --workdir).
+ */
+ workingDir?: string;
+ /**
+ * Phase 4 slice ζ: emit `--add-dir <DIR>` per directory. Vibe's `--help`
+ * states the flag "Can be specified multiple times" — each entry is its
+ * own argv pair.
+ */
+ addDir?: string[];
  }
  export interface PrepareMistralRequestResult {
  args: string[];
@@ -364,6 +375,15 @@ export interface ClaudeHighImpactFlagsInput {
  * `--output-schema`, which takes a path).
  */
  jsonSchema?: string | Record<string, unknown>;
+ /**
+ * Phase 4 slice ζ — Claude `--add-dir <dirs...>`. Additional directories the
+ * Claude CLI is allowed to read/write beyond the process cwd. The CLI accepts
+ * a single variadic flag (space-separated values) per `claude --help`; we
+ * emit one `--add-dir` instance per directory so each path is its own argv
+ * token (survives any future tightening of the variadic parser without
+ * changing the call site).
+ */
+ addDir?: string[];
  }
  /**
  * Emit Claude high-impact feature flags (U25) as a flat argv segment.
@@ -185,6 +185,14 @@ export function prepareMistralRequest(input) {
  if (input.maxPrice !== undefined) {
  args.push("--max-price", String(input.maxPrice));
  }
+ if (input.workingDir) {
+ args.push("--workdir", input.workingDir);
+ }
+ if (input.addDir && input.addDir.length > 0) {
+ for (const dir of input.addDir) {
+ args.push("--add-dir", dir);
+ }
+ }
  const ignoredDisallowedTools = Boolean(input.disallowedTools && input.disallowedTools.length > 0);
  return { args, env, ignoredDisallowedTools };
  }
@@ -445,6 +453,11 @@ export function prepareClaudeHighImpactFlags(input) {
  const schemaArg = typeof input.jsonSchema === "string" ? input.jsonSchema : JSON.stringify(input.jsonSchema);
  args.push("--json-schema", schemaArg);
  }
+ if (input.addDir && input.addDir.length > 0) {
+ for (const dir of input.addDir) {
+ args.push("--add-dir", dir);
+ }
+ }
  return args;
  }
  //──────────────────────────────────────────────────────────────────────────────
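Reviewer note on the two `prepare*` hunks above: both add the same "one flag instance per array entry" loop for `addDir`. A minimal sketch of that pattern follows, assuming a hypothetical helper named `pushRepeated`; no such helper appears in this diff, and the package inlines the loop instead.

```typescript
// Hypothetical helper (not part of llm-cli-gateway): appends `flag value`
// once per entry, so each value stays its own argv token.
function pushRepeated(args: string[], flag: string, values?: string[]): string[] {
  if (values && values.length > 0) {
    for (const value of values) {
      args.push(flag, value);
    }
  }
  return args;
}

// Mirrors the shape of the `claude-add-dir` conformance fixture in this diff.
const argv = pushRepeated(["-p", "hello"], "--add-dir", ["/tmp/a", "/tmp/b"]);
console.log(argv.join(" "));
// prints "-p hello --add-dir /tmp/a --add-dir /tmp/b"
```

An undefined or empty array leaves `args` untouched, which matches the `input.addDir && input.addDir.length > 0` guard in the hunks.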
@@ -39,6 +39,8 @@ export const UPSTREAM_CLI_CONTRACTS = {
  "excludeDynamicSystemPromptSections",
  "fallbackModel",
  "jsonSchema",
+ // Phase 4 slice ζ
+ "addDir",
  "approvalStrategy",
  "mcpServers",
  "strictMcpConfig",
@@ -88,6 +90,10 @@ export const UPSTREAM_CLI_CONTRACTS = {
  arity: "one",
  description: "JSON Schema literal constraining structured output",
  },
+ "--add-dir": {
+ arity: "one",
+ description: "Additional workspace directory (Phase 4 slice ζ; repeat once per directory)",
+ },
  "--continue": { arity: "none", description: "Continue active session" },
  "--session-id": { arity: "one", description: "Session id" },
  },
@@ -128,6 +134,14 @@ export const UPSTREAM_CLI_CONTRACTS = {
  ],
  expect: "pass",
  },
+ {
+ // Phase 4 slice ζ: --add-dir wired through prepareClaudeHighImpactFlags.
+ // Repeated once per directory; each instance has arity:"one".
+ id: "claude-add-dir",
+ description: "Phase 4 slice ζ: repeated --add-dir is accepted",
+ args: ["-p", "hello", "--add-dir", "/tmp/a", "--add-dir", "/tmp/b"],
+ expect: "pass",
+ },
  ],
  },
  codex: {
@@ -164,6 +178,9 @@ export const UPSTREAM_CLI_CONTRACTS = {
  "images",
  "ignoreUserConfig",
  "ignoreRules",
+ // Phase 4 slice ζ
+ "workingDir",
+ "addDir",
  ],
  resumeOnlyFlags: ["--last"],
  // Phase 4 slice α (v1.8.0) verified that `codex exec resume` accepts
@@ -203,6 +220,18 @@ export const UPSTREAM_CLI_CONTRACTS = {
  "-i": { arity: "one", description: "Image path" },
  "--ignore-user-config": { arity: "none", description: "Ignore user config" },
  "--ignore-rules": { arity: "none", description: "Ignore rule files" },
+ // The gateway only ever emits the short form `-C` (codex 0.134.0 accepts
+ // both `-C` and `--cd` as aliases). The contract registers exactly what
+ // we emit; if a future code path emits `--cd` instead, the contract
+ // check will fail loudly — which is the intended catch.
+ "-C": {
+ arity: "one",
+ description: "Working root for the session (Phase 4 slice ζ; new sessions only)",
+ },
+ "--add-dir": {
+ arity: "one",
+ description: "Additional writable workspace directory (Phase 4 slice ζ; repeat once per directory; new sessions only)",
+ },
  },
  env: {},
  conformanceFixtures: [
@@ -239,6 +268,26 @@ export const UPSTREAM_CLI_CONTRACTS = {
  args: ["exec", "resume", "--search", "session-id", "hello"],
  expect: "fail",
  },
+ {
+ id: "codex-working-dir",
+ description: "Phase 4 slice ζ: -C <DIR> accepted on a new session",
+ args: ["exec", "--skip-git-repo-check", "-C", "/tmp/work", "hello"],
+ expect: "pass",
+ },
+ {
+ id: "codex-add-dir",
+ description: "Phase 4 slice ζ: repeated --add-dir accepted on a new session",
+ args: [
+ "exec",
+ "--skip-git-repo-check",
+ "--add-dir",
+ "/tmp/a",
+ "--add-dir",
+ "/tmp/b",
+ "hello",
+ ],
+ expect: "pass",
+ },
  ],
  },
  gemini: {
@@ -350,6 +399,14 @@ export const UPSTREAM_CLI_CONTRACTS = {
  "disallowedTools",
  // Phase 4 slice δ
  "maxTurns",
+ // Phase 4 slice ζ
+ "workingDir",
+ // Phase 4 slice θ — Grok HIGH parity
+ "sandbox",
+ "rules",
+ "systemPromptOverride",
+ "allow",
+ "deny",
  ],
  flags: {
  "-p": { arity: "one", description: "Prompt text" },
@@ -379,6 +436,34 @@ export const UPSTREAM_CLI_CONTRACTS = {
  pattern: /^[1-9][0-9]*$/,
  description: "Agent-loop iteration cap (Phase 4 slice δ)",
  },
+ "--cwd": {
+ arity: "one",
+ description: "Working directory for the invocation (Phase 4 slice ζ)",
+ },
+ // Phase 4 slice θ — Grok HIGH parity. `--sandbox` is freeform per
+ // `grok --help` on 0.1.210 (no `[possible values: …]` list, unlike
+ // --effort / --permission-mode / --output-format), so we register
+ // it without a `values` constraint.
+ "--sandbox": {
+ arity: "one",
+ description: "Sandbox profile for filesystem + network access (Phase 4 slice θ; freeform passthrough; env: GROK_SANDBOX)",
+ },
+ "--rules": {
+ arity: "one",
+ description: "Extra rules appended to the system prompt; supports `@file` prefix (Phase 4 slice θ)",
+ },
+ "--system-prompt-override": {
+ arity: "one",
+ description: "Replace the agent's system prompt entirely (Phase 4 slice θ)",
+ },
+ "--allow": {
+ arity: "one",
+ description: "Permission allow rule (Phase 4 slice θ; repeat once per rule per `grok --help`)",
+ },
+ "--deny": {
+ arity: "one",
+ description: "Permission deny rule (Phase 4 slice θ; repeat once per rule per `grok --help`)",
+ },
  },
  env: {},
  conformanceFixtures: [
@@ -406,6 +491,42 @@ export const UPSTREAM_CLI_CONTRACTS = {
  args: ["-p", "hello", "--max-turns", "0"],
  expect: "fail",
  },
+ {
+ id: "grok-working-dir",
+ description: "Phase 4 slice ζ: --cwd <DIR> is accepted",
+ args: ["-p", "hello", "--cwd", "/tmp/work"],
+ expect: "pass",
+ },
+ {
+ id: "grok-sandbox",
+ description: "Phase 4 slice θ: --sandbox <PROFILE> accepted (freeform)",
+ args: ["-p", "hello", "--sandbox", "workspace-write"],
+ expect: "pass",
+ },
+ {
+ id: "grok-rules",
+ description: "Phase 4 slice θ: --rules <RULES> accepted (@file prefix preserved)",
+ args: ["-p", "hello", "--rules", "@./rules.md"],
+ expect: "pass",
+ },
+ {
+ id: "grok-system-prompt-override",
+ description: "Phase 4 slice θ: --system-prompt-override <PROMPT> accepted",
+ args: ["-p", "hello", "--system-prompt-override", "You are a tester"],
+ expect: "pass",
+ },
+ {
+ id: "grok-allow-repeated",
+ description: "Phase 4 slice θ: repeated --allow <RULE> accepted",
+ args: ["-p", "hello", "--allow", "bash", "--allow", "edit"],
+ expect: "pass",
+ },
+ {
+ id: "grok-deny-repeated",
+ description: "Phase 4 slice θ: repeated --deny <RULE> accepted",
+ args: ["-p", "hello", "--deny", "write", "--deny", "kill"],
+ expect: "pass",
+ },
  ],
  },
  mistral: {
@@ -434,6 +555,9 @@ export const UPSTREAM_CLI_CONTRACTS = {
  // Phase 4 slice δ
  "maxTurns",
  "maxPrice",
+ // Phase 4 slice ζ
+ "workingDir",
+ "addDir",
  ],
  flags: {
  "-p": { arity: "one", description: "Prompt text" },
@@ -468,6 +592,14 @@ export const UPSTREAM_CLI_CONTRACTS = {
  pattern: /^(0|[1-9][0-9]*)(\.[0-9]+)?$/,
  description: "Cumulative cost cap in USD (Phase 4 slice δ, programmatic mode only)",
  },
+ "--workdir": {
+ arity: "one",
+ description: "Working directory for the invocation (Phase 4 slice ζ)",
+ },
+ "--add-dir": {
+ arity: "one",
+ description: "Additional writable workspace directory (Phase 4 slice ζ; repeat once per directory)",
+ },
  },
  env: {
  VIBE_ACTIVE_MODEL: {
@@ -512,6 +644,29 @@ export const UPSTREAM_CLI_CONTRACTS = {
  env: { VIBE_ACTIVE_MODEL: "mistral-medium-3.5" },
  expect: "fail",
  },
+ {
+ id: "mistral-working-dir",
+ description: "Phase 4 slice ζ: --workdir <DIR> is accepted",
+ args: ["-p", "hello", "--agent", "auto-approve", "--workdir", "/tmp/work"],
+ env: { VIBE_ACTIVE_MODEL: "mistral-medium-3.5" },
+ expect: "pass",
+ },
+ {
+ id: "mistral-add-dir",
+ description: "Phase 4 slice ζ: repeated --add-dir is accepted",
+ args: [
+ "-p",
+ "hello",
+ "--agent",
+ "auto-approve",
+ "--add-dir",
+ "/tmp/a",
+ "--add-dir",
+ "/tmp/b",
+ ],
+ env: { VIBE_ACTIVE_MODEL: "mistral-medium-3.5" },
+ expect: "pass",
+ },
  ],
  },
  };
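Reviewer note on the contract table above: the Codex comment says the contract registers only `-C`, so a code path that emits `--cd` should fail the check. The sketch below illustrates that idea under stated assumptions. `findUnregisteredFlags` and `sampleFlags` are hypothetical names invented for this note, and the gateway's actual conformance checker is not shown in this diff and may work differently.

```typescript
type Arity = "none" | "one";

interface FlagContract {
  arity: Arity;
  description: string;
}

// Hypothetical checker: returns every dash-prefixed token that is not a
// registered flag. Values of arity:"one" flags are skipped, not inspected.
function findUnregisteredFlags(
  argv: string[],
  flags: Record<string, FlagContract>,
): string[] {
  const unregistered: string[] = [];
  for (let i = 0; i < argv.length; i++) {
    const token = argv[i];
    if (token === undefined) break;
    if (Object.prototype.hasOwnProperty.call(flags, token)) {
      // A registered arity:"one" flag consumes the next token as its value.
      if (flags[token]?.arity === "one") i++;
      continue;
    }
    if (token.startsWith("-")) unregistered.push(token);
  }
  return unregistered;
}

// Sample table for illustration only; the first entry is an assumption.
const sampleFlags: Record<string, FlagContract> = {
  "--skip-git-repo-check": { arity: "none", description: "assumed entry for this sketch" },
  "-C": { arity: "one", description: "Working root for the session" },
  "--add-dir": { arity: "one", description: "Additional writable workspace directory" },
};

// Short form is registered, so nothing is reported.
console.log(findUnregisteredFlags(["exec", "--skip-git-repo-check", "-C", "/tmp/work", "hello"], sampleFlags));
// The long alias is not registered, so it is reported.
console.log(findUnregisteredFlags(["exec", "--skip-git-repo-check", "--cd", "/tmp/work", "hello"], sampleFlags));
```

Registering only the emitted spelling keeps the table a record of gateway behaviour instead of a mirror of everything the upstream CLI accepts.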
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "llm-cli-gateway",
- "version": "1.11.0",
+ "version": "1.13.0",
  "mcpName": "io.github.verivus-oss/llm-cli-gateway",
  "description": "MCP server providing unified access to Claude Code, Codex, Gemini, Grok, and Mistral Vibe CLIs with session management, retry logic, async job orchestration, durable job results, and cross-LLM validation.",
  "license": "MIT",