llm-cli-gateway 1.11.0 → 1.13.0

Files changed (7)

package/CHANGELOG.md +245 -0
package/dist/index.d.ts +51 -1
package/dist/index.js +198 -8
package/dist/request-helpers.d.ts +20 -0
package/dist/request-helpers.js +13 -0
package/dist/upstream-contracts.js +155 -0
package/package.json +1 -1

package/CHANGELOG.md CHANGED

@@ -2,6 +2,251 @@
 All notable changes to the llm-cli-gateway project.
+## [1.13.0] - 2026-05-27 — Phase 4 slice θ (Grok HIGH parity)
+Ships the eighth Phase 4 slice: five HIGH-impact Grok CLI flags are now
+reachable from `grok_request` and `grok_request_async`. Grok was the
+most under-wired provider per the 2026-05-27 audit; this slice closes
+the HIGH-severity gap in a single bundled PR. Three commits land
+together (feature wiring, contract registration, test-veracity
+regressions) plus this release commit.
+### Added — five HIGH-impact Grok flags
+- **`sandbox`** → `--sandbox <PROFILE>`. Freeform passthrough per
+  `grok --help` on 0.1.210 (no `[possible values: …]` listing, unlike
+  `--effort` / `--permission-mode` / `--output-format` which all
+  enumerate). Also settable via the `GROK_SANDBOX` env var. Caller
+  responsibility to pass a valid profile name. The slice deliberately
+  does **not** integrate `--sandbox` with `approvalStrategy:
+  "mcp_managed"` because the value is unbounded — Grok's approval
+  semantics are already covered by `permissionMode` + `alwaysApprove` +
+  `approvalStrategy`.
+- **`rules`** → `--rules <RULES>`. Supports `@file` prefix per
+  `grok --help` to load from a file; the gateway passes the value
+  verbatim and lets Grok parse the prefix. Bounded via
+  `z.string().min(1)`.
+- **`systemPromptOverride`** → `--system-prompt-override <PROMPT>`.
+  Distinct from Claude's `--system-prompt` / `--append-system-prompt`
+  (Grok has only one override flag, not a pair). Bounded via
+  `z.string().min(1)`.
+- **`allow`** → `--allow <RULE>` (repeatable). Each array entry is
+  emitted as its own `--allow` argv instance per `grok --help`
+  ("Repeat to add multiple rules"). NOT comma-joined like the existing
+  `--tools` / `--disallowed-tools` Grok wiring.
+- **`deny`** → `--deny <RULE>` (repeatable). Same semantics as `allow`.
+All five flags surfaced on both `grok_request` and `grok_request_async`
+(slice δ sync+async parity invariant). Threaded from MCP-side Zod
+through `GrokRequestParams` → `handleGrokRequest` /
+`handleGrokRequestAsync` → `prepareGrokRequest` argv emission.
+### Contract surface
+`UPSTREAM_CLI_CONTRACTS.grok` updates:
+- `flags["--sandbox"]` (arity:"one"; **NO `values` enum** per live
+  `grok --help` — `--sandbox` is freeform, unlike Codex's
+  read-only/workspace-write/danger-full-access enum).
+- `flags["--rules"]` (arity:"one").
+- `flags["--system-prompt-override"]` (arity:"one").
+- `flags["--allow"]` (arity:"one"; multiple instances accepted because
+  `arity:"one"` means "consumes one value per instance" not "max one
+  instance").
+- `flags["--deny"]` (arity:"one"; same).
+- `mcpParameters` array updated with five new entries.
+- Five new passing conformance fixtures (`grok-sandbox`, `grok-rules`,
+  `grok-system-prompt-override`, `grok-allow-repeated`,
+  `grok-deny-repeated`); each is mechanically validated against
+  `validateUpstreamCliArgs` in the REGRESSIONS Tε suite, closing the
+  fixture-existence-vs-mechanical-validation gap identified in slice ε
+  round 1.
+### Out of scope
+- **Approval-manager integration for `--sandbox`** — explicitly
+  deferred. Grok's sandbox value is freeform per the live CLI surface;
+  integrating it with the approval manager (as Codex does for its
+  bounded enum) would require either (a) hardcoding an allowlist of
+  profile names in the gateway, or (b) a different security model
+  where the caller asserts the profile is "safe enough". Neither is
+  obvious from current Grok docs. Revisit when Grok ships an enum or
+  publishes a sandbox-profile taxonomy.
+### Test-veracity audit
+Per the standing protocol
+(`feedback_test_veracity_audit_protocol`), this slice's tests were
+audited by four LLM reviewers (Codex, Grok, Mistral, Claude) in async
+parallel with mandatory mutation-probe execution against
+`docs/plans/test-veracity-audit-slice-theta.spec.md`.
+**Round 1 outcomes:**
+- Codex: UNCONDITIONAL APPROVE — all 12 probes [as predicted], all
+  26 tests VERIFIED. Baseline (`npm test`: 55 files / 884 tests; build
+  + format:check clean; slice file 31/31).
+- Grok: UNCONDITIONAL APPROVE — all 12 probes [as predicted]; ran in
+  an isolated worktree at `/tmp/theta-audit-grok` per the slice-ζ
+  reviewer-stomping lesson.
+- Mistral: UNCONDITIONAL APPROVE — all 12 probes [as predicted].
+- Claude: UNCONDITIONAL APPROVE — all 12 probes [as predicted]; noted
+  the extra Tε-2 test (custom-profile freeform regression probe) goes
+  beyond the spec and closes the "enum-mistake stays silent if fixture
+  uses a listed value" gap.
+- Gemini: **FAILED at 10s** with `TerminalQuotaError: You have
+  exhausted your capacity on this model. Your quota will reset after
+  52m10s.` (Google 429). Documented quota blocker per protocol clause
+  5+6 — counts as "concrete unfixable when documented". Four
+  substantive valid approves from independent vendor families (OpenAI,
+  xAI, Mistral, Anthropic) satisfy the gate.
+The 31 new tests (853 → 884 total) cover every new field/flag/fixture
+across REGRESSIONS Tα/β/ε:
+- **Tα** — Registered tool inputSchema for every new field on both
+  sync and async tools, including `.min(1)` empty-string rejection on
+  the three string fields (sandbox, rules, systemPromptOverride).
+- **Tβ** — `prepareGrokRequest` end-to-end argv emission per flag.
+  Explicit "repeated `--allow`/`--deny` instances, NOT comma-joined
+  like `--tools`" assertions catch the comma-join regression class. An
+  "@file prefix passes through verbatim" assertion catches a "helpful
+  preprocessor" regression. Prepare → contract end-to-end via
+  `validateUpstreamCliArgs` (REGRESSIONS D pattern; closes the slice
+  α/γ/δ contract-table gap class).
+- **Tε** — `UPSTREAM_CLI_CONTRACTS` introspection + mechanical fixture
+  validation in the same `it()` block. Explicit assertion that
+  `--sandbox` has **no `values` enum** (catches the "freeform vs enum"
+  regression that an over-zealous future contributor might introduce).
+  Extra Tε-2 probe asserts a non-standard sandbox profile passes
+  `validateUpstreamCliArgs`.
+### Mechanical anchors (verify with `rg` before relying)
+- `src/index.ts` — `prepareGrokRequest` signature gains five fields
+  (`:1968-1995`), emission block (`:2088-2110`), `GrokRequestParams`
+  interface (`:2819-2829`), `handleGrokRequest` threading
+  (`:2854-2858`), `handleGrokRequestAsync` threading (`:3041-3045`),
+  sync `grok_request` Zod registration (`:4890-4922`), async
+  `grok_request_async` Zod registration (`:5906-5938`).
+- `src/upstream-contracts.ts` — `grok.mcpParameters` (`:459-463`),
+  `grok.flags` entries (`:501-524`), conformance fixtures
+  (`:559-587`).
+## [1.12.0] - 2026-05-27 — Phase 4 slice ζ (working-dir + add-dir cross-provider)
+Ships the seventh Phase 4 slice: working-directory and additional-directory
+flags are now reachable across four CLIs in a single bundled PR. Three
+commits land together (feature wiring, contract registration, test-veracity
+regressions) plus this release commit.
+### Added — working-dir + add-dir parity for four CLIs
+- **Claude** — `claude_request` and `claude_request_async` accept a new
+  `addDir: string[]` field. Threaded through `prepareClaudeRequest` →
+  `prepareClaudeHighImpactFlags` (`src/request-helpers.ts:687`). Each
+  entry emits its own `--add-dir` instance per `claude --help` ("Additional
+  directories to allow tool access to"). Claude has no working-dir flag
+  (uses the process cwd).
+- **Codex** — `codex_request` and `codex_request_async` accept new
+  `workingDir: string` (min 1) and `addDir: string[]` fields. Both flags
+  are already in `CODEX_RESUME_FILTERED_FLAGS` (the original session's cwd
+  and writable-dir policy are inherited on resume), so `prepareCodexRequest`
+  gates emission on `sessionPlan.mode === "new"` — resume argv stays clean
+  rather than emitting then stripping. Emits `-C <DIR>` (one) and
+  `--add-dir <DIR>` (one instance per entry).
+- **Grok** — `grok_request` and `grok_request_async` accept a new
+  `workingDir: string` (min 1) field. `prepareGrokRequest` emits
+  `--cwd <DIR>`. Grok has no `--add-dir` analogue.
+- **Vibe (Mistral)** — `mistral_request` and `mistral_request_async`
+  accept new `workingDir: string` (min 1) and `addDir: string[]` fields.
+  `prepareMistralRequest` (the `request-helpers.ts` helper) emits
+  `--workdir <DIR>` (one) and `--add-dir <DIR>` (one per entry; Vibe's
+  `--help` states the flag "Can be specified multiple times").
+  `buildMistralRetryPrep` threads both fields through to the stale-model
+  recovery argv per the slice-δ retry-path invariant.
+- **Gemini** is not re-wired: `--include-directories` was wired in master
+  before this slice. A regression-guard test in REGRESSIONS Zε asserts
+  the existing wiring stays intact while adjacent contract entries
+  changed.
+### Out of scope — worktree flags
+Worktree flags (`-w/--worktree` on Claude, Gemini, Grok) create new git
+worktree directories on disk with lifecycle implications and are
+explicitly deferred to a later slice with explicit cleanup semantics.
+### Contract surface
+`UPSTREAM_CLI_CONTRACTS` updates:
+- `claude.flags["--add-dir"]` (arity:"one"; repeated instances accepted)
+- `codex.flags["-C"]` (the gateway only emits the short form; codex
+  0.134.0 accepts `--cd` as an alias but the contract registers exactly
+  what we emit — a future code path that emitted `--cd` would correctly
+  fail the contract check).
+- `codex.flags["--add-dir"]`
+- `grok.flags["--cwd"]`
+- `mistral.flags["--workdir"]`
+- `mistral.flags["--add-dir"]`
+- `mcpParameters` arrays updated for all four CLIs.
+- Six new passing conformance fixtures (`claude-add-dir`,
+  `codex-working-dir`, `codex-add-dir`, `grok-working-dir`,
+  `mistral-working-dir`, `mistral-add-dir`); each is mechanically
+  validated against `validateUpstreamCliArgs` in the REGRESSIONS Zε
+  suite, closing the gap class identified in slice ε round 1.
+### Test-veracity audit
+Per the standing protocol (`feedback_test_veracity_audit_protocol`),
+this slice's tests were audited by all five LLM reviewers (Codex,
+Gemini, Grok, Mistral, Claude) in async parallel with mandatory
+mutation-probe execution against `docs/plans/test-veracity-audit-slice-zeta.spec.md`.
+**Round 1 outcomes:**
+- Codex: UNCONDITIONAL APPROVE — all 13 probes [as predicted], all 37
+  tests VERIFIED. Baseline (`npx vitest run` on the slice file: 37/37;
+  `npm test`: 54 files / 853 tests; build + format:check clean).
+- Grok: UNCONDITIONAL APPROVE — all 13 probes [as predicted].
+- Mistral: UNCONDITIONAL APPROVE — all 13 probes [as predicted].
+- Claude: UNCONDITIONAL APPROVE — all 13 probes red as predicted; ran
+  in an isolated `/tmp/zeta-audit-claude` worktree because the four
+  parallel reviewers were concurrently mutating the live tree.
+- Gemini: UNCONDITIONAL APPROVE — all 13 probes [as predicted].
+First unanimous round-1 pass on a multi-CLI slice. The 37 new tests
+(816 → 853 total) cover every new field/flag/fixture across REGRESSIONS
+Zα/β/ε:
+- **Zα** — Registered tool inputSchema for every new field on every
+  tool (sync + async), including `.min(1)` empty-string rejection on
+  `workingDir`.
+- **Zβ** — `prepare*Request` end-to-end argv emission per CLI. The
+  Codex resume branch asserts NEITHER `-C` NOR `--add-dir` appears
+  in resume argv. `buildMistralRetryPrep` regression catches the
+  slice-δ retry-path bug class. Prepare → contract end-to-end
+  consistency covers all four CLIs.
+- **Zε** — `UPSTREAM_CLI_CONTRACTS` introspection + mechanical
+  fixture validation in the same `it()` block (slice-ε round-1 gap
+  class). Includes a regression guard for the pre-existing Gemini
+  `--include-directories` wiring.
+### Mechanical anchors (verify with `rg` before relying)
+- `src/request-helpers.ts` — `ClaudeHighImpactFlagsInput.addDir`
+  (`:610`), `prepareClaudeHighImpactFlags` emission (`:686-690`).
+  `PrepareMistralRequestInput.workingDir`/`.addDir` (`:248-264`),
+  `prepareMistralRequest` emission (`:300-307`).
+- `src/index.ts` — `prepareClaudeRequest` (`:1338`),
+  `prepareCodexRequest` new-session gate (`:1687-1700`),
+  `prepareGrokRequest` `--cwd` emission (`:2065-2067`),
+  `prepareMistralRequest` wrapper (`:2153-2168`),
+  `buildMistralRetryPrep` (`:2249-2289`).
+- `src/upstream-contracts.ts` — flag registrations and conformance
+  fixtures for the four CLIs (`:146-149`, `:281-292`, `:438-441`,
+  `:524-533`, plus `mcpParameters` entries).
 ## [1.11.0] - 2026-05-27 — Phase 4 slice η (Claude `--fallback-model` + `--json-schema`)
 Ships the sixth Phase 4 slice: Claude's reliability fallback and
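
The 1.13.0 entry above describes two behaviors that are easy to get wrong: `--allow` / `--deny` are emitted once per array entry rather than comma-joined, and `arity:"one"` means "one value per instance", not "at most one instance". The following is a minimal sketch of both, not the gateway's actual code. `GrokFlagInput`, `emitGrokFlags`, `ARITY_ONE_FLAGS`, and `checkArity` are hypothetical names, and the rule strings are placeholders rather than real Grok rule syntax.

```typescript
// Hypothetical input shape; field names follow the changelog, the type name does not.
interface GrokFlagInput {
  sandbox?: string;
  rules?: string;
  systemPromptOverride?: string;
  allow?: string[];
  deny?: string[];
}

// Sketch of the argv emission described in the changelog (assumed, not the real function).
function emitGrokFlags(input: GrokFlagInput): string[] {
  const args: string[] = [];
  // Single-value flags are passed through verbatim, including any `@file` prefix on rules.
  if (input.sandbox) args.push("--sandbox", input.sandbox);
  if (input.rules) args.push("--rules", input.rules);
  if (input.systemPromptOverride) {
    args.push("--system-prompt-override", input.systemPromptOverride);
  }
  // Repeatable flags: one flag/value pair per entry, never comma-joined.
  for (const rule of input.allow ?? []) args.push("--allow", rule);
  for (const rule of input.deny ?? []) args.push("--deny", rule);
  return args;
}

// Hypothetical stand-in for a contract table where every flag has arity "one".
const ARITY_ONE_FLAGS = new Set([
  "--sandbox",
  "--rules",
  "--system-prompt-override",
  "--allow",
  "--deny",
]);

// Each registered flag instance must be followed by exactly one value.
// Repeated instances of the same flag are accepted.
function checkArity(args: string[]): string[] {
  const errors: string[] = [];
  for (let i = 0; i < args.length; i++) {
    const flag = args[i];
    if (!ARITY_ONE_FLAGS.has(flag)) {
      errors.push(`unregistered flag: ${flag}`);
      continue;
    }
    if (i + 1 >= args.length) {
      errors.push(`missing value for ${flag}`);
      continue;
    }
    i++; // skip the value this instance consumed
  }
  return errors;
}

const argv = emitGrokFlags({
  rules: "@team-rules.md", // placeholder path; the prefix is left for Grok to parse
  allow: ["read:src", "read:docs"], // placeholder rules
  deny: ["write:secrets"],
});
console.log(argv.join(" "));
console.log(checkArity(argv).length === 0 ? "contract ok" : checkArity(argv));
```

Under these assumptions, a comma-joined `--allow read:src,read:docs` would still pass an arity check, which is presumably why the changelog's Tβ tests assert the repeated-instance shape directly instead of relying on the contract alone.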

package/dist/index.d.ts CHANGED

@@ -157,6 +157,7 @@ export declare function prepareClaudeRequest(params: {
     excludeDynamicSystemPromptSections?: boolean;
     fallbackModel?: string;
     jsonSchema?: string | Record<string, unknown>;
+    addDir?: string[];
 }, runtime?: GatewayServerRuntime): CliRequestPrep | ExtendedToolResponse;
 export interface CodexRequestPrep extends CliRequestPrep {
     /**
@@ -199,6 +200,8 @@ export declare function prepareCodexRequest(params: {
     images?: string[];
     ignoreUserConfig?: boolean;
     ignoreRules?: boolean;
+    workingDir?: string;
+    addDir?: string[];
 }, runtime?: GatewayServerRuntime): CodexRequestPrep | ExtendedToolResponse;
 export declare function prepareGeminiRequest(params: {
     prompt?: string;
@@ -254,6 +257,31 @@ export declare function prepareGrokRequest(params: {
      * iterations for cost / latency control. Mirrors Claude's wiring.
      */
     maxTurns?: number;
+    /**
+     * Phase 4 slice ζ: emit `--cwd <DIR>` so headless callers can set Grok's
+     * working directory without depending on the gateway process's cwd.
+     */
+    workingDir?: string;
+    /**
+     * Phase 4 slice θ — Grok HIGH parity. All five are passthrough flags:
+     *
+     * - `sandbox` → `--sandbox <PROFILE>` (freeform; Grok 0.1.210 --help
+     *   shows no enum constraint, unlike --effort / --permission-mode /
+     *   --output-format which all show `[possible values: …]`).
+     * - `rules` → `--rules <RULES>`. Supports `@file` prefix; gateway
+     *   passes the value verbatim and lets Grok parse it.
+     * - `systemPromptOverride` → `--system-prompt-override <PROMPT>`.
+     *   Distinct from Claude's --system-prompt / --append-system-prompt
+     *   (Grok has only one override flag).
+     * - `allow` / `deny` → repeatable `--allow <RULE>` / `--deny <RULE>`
+     *   per --help ("Repeat to add multiple rules"). One argv pair per
+     *   entry — NOT comma-joined like --tools / --disallowed-tools.
+     */
+    sandbox?: string;
+    rules?: string;
+    systemPromptOverride?: string;
+    allow?: string[];
+    deny?: string[];
 }, runtime?: GatewayServerRuntime): CliRequestPrep | ExtendedToolResponse;
 export declare function prepareMistralRequest(params: {
     prompt?: string;
@@ -280,6 +308,10 @@ export declare function prepareMistralRequest(params: {
     maxTurns?: number;
     /** Phase 4 slice δ: Vibe `--max-price DOLLARS` cumulative-cost cap. */
     maxPrice?: number;
+    /** Phase 4 slice ζ: Vibe `--workdir <DIR>` working-directory parity. */
+    workingDir?: string;
+    /** Phase 4 slice ζ: Vibe `--add-dir <DIR>` repeatable add-dir parity. */
+    addDir?: string[];
 }, runtime?: GatewayServerRuntime): (CliRequestPrep & {
     mistralEnv: Record<string, string>;
 }) | ExtendedToolResponse;
@@ -292,7 +324,7 @@ export declare function prepareMistralRequest(params: {
  * through here, or a fresh-workspace / budgeted run can degrade on
  * the second attempt.
  */
-export declare function buildMistralRetryPrep(params: Pick<MistralRequestParams, "outputFormat" | "permissionMode" | "effort" | "reasoningEffort" | "allowedTools" | "disallowedTools" | "approvalStrategy" | "trust" | "maxTurns" | "maxPrice"> & {
+export declare function buildMistralRetryPrep(params: Pick<MistralRequestParams, "outputFormat" | "permissionMode" | "effort" | "reasoningEffort" | "allowedTools" | "disallowedTools" | "approvalStrategy" | "trust" | "maxTurns" | "maxPrice" | "workingDir" | "addDir"> & {
     effectivePrompt: string;
 }, recoveryModel: string): {
     args: string[];
@@ -368,6 +400,18 @@ export interface GrokRequestParams {
     forceRefresh?: boolean;
     /** Phase 4 slice δ: cap agent-loop iterations via `--max-turns N`. */
     maxTurns?: number;
+    /** Phase 4 slice ζ: emit `--cwd <DIR>` so the CLI uses the specified working directory. */
+    workingDir?: string;
+    /** Phase 4 slice θ: Grok `--sandbox <PROFILE>` (freeform passthrough). */
+    sandbox?: string;
+    /** Phase 4 slice θ: Grok `--rules <RULES>` (supports `@file` prefix; verbatim passthrough). */
+    rules?: string;
+    /** Phase 4 slice θ: Grok `--system-prompt-override <PROMPT>`. */
+    systemPromptOverride?: string;
+    /** Phase 4 slice θ: Grok `--allow <RULE>` (repeatable; one entry per --allow instance). */
+    allow?: string[];
+    /** Phase 4 slice θ: Grok `--deny <RULE>` (repeatable; one entry per --deny instance). */
+    deny?: string[];
 }
 export declare function handleGrokRequest(deps: HandlerDeps, params: GrokRequestParams): Promise<ExtendedToolResponse>;
 export declare function handleGrokRequestAsync(deps: AsyncHandlerDeps, params: Omit<GrokRequestParams, "optimizeResponse">): Promise<ExtendedToolResponse>;
@@ -398,6 +442,10 @@ export interface MistralRequestParams {
     maxTurns?: number;
     /** Phase 4 slice δ: Vibe `--max-price DOLLARS` cumulative-cost cap. */
     maxPrice?: number;
+    /** Phase 4 slice ζ: Vibe `--workdir <DIR>` working-directory parity. */
+    workingDir?: string;
+    /** Phase 4 slice ζ: Vibe `--add-dir <DIR>` repeatable add-dir parity. */
+    addDir?: string[];
 }
 export declare function handleMistralRequest(deps: HandlerDeps, params: MistralRequestParams): Promise<ExtendedToolResponse>;
 export declare function handleMistralRequestAsync(deps: AsyncHandlerDeps, params: Omit<MistralRequestParams, "optimizeResponse">): Promise<ExtendedToolResponse>;
@@ -430,6 +478,8 @@ export declare function handleCodexRequestAsync(deps: AsyncHandlerDeps, params:
     images?: string[];
     ignoreUserConfig?: boolean;
     ignoreRules?: boolean;
+    workingDir?: string;
+    addDir?: string[];
 }): Promise<ExtendedToolResponse>;
 export declare function createGatewayServer(deps?: GatewayServerDeps): McpServer;
 export {};
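
The declarations above add `workingDir` and `addDir` to `prepareCodexRequest`, and the changelog states that both are emitted only when the session plan is new, so resume argv stays clean. A rough sketch of that gate follows. `SessionMode`, `CodexDirInput`, and `emitCodexDirFlags` are hypothetical names, and the `"resume"` mode value is an assumption (the source only shows the `"new"` comparison).

```typescript
// Assumed mode values; only "new" appears in the diffed source.
type SessionMode = "new" | "resume";

// Hypothetical input shape carrying just the two new fields.
interface CodexDirInput {
  workingDir?: string;
  addDir?: string[];
}

// Emit `-C <DIR>` and one `--add-dir <DIR>` per entry, but only for new sessions.
function emitCodexDirFlags(mode: SessionMode, input: CodexDirInput): string[] {
  const args: string[] = [];
  if (mode !== "new") {
    // Resume inherits the original session's cwd and writable-dir policy,
    // so nothing is emitted rather than emitted and stripped later.
    return args;
  }
  if (input.workingDir) {
    args.push("-C", input.workingDir);
  }
  for (const dir of input.addDir ?? []) {
    args.push("--add-dir", dir);
  }
  return args;
}

const input: CodexDirInput = { workingDir: "/srv/app", addDir: ["/srv/shared", "/tmp/cache"] };
console.log(emitCodexDirFlags("new", input));    // flags present
console.log(emitCodexDirFlags("resume", input)); // empty array
```

Gating at emission time, as opposed to filtering afterwards, keeps argv logs aligned with what the CLI actually receives, which matches the rationale given in the `prepareCodexRequest` comment in the `index.js` hunk.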

package/dist/index.js CHANGED

@@ -1007,6 +1007,7 @@ export function prepareClaudeRequest(params, runtime = resolveGatewayServerRunti
         excludeDynamicSystemPromptSections: params.excludeDynamicSystemPromptSections,
         fallbackModel: params.fallbackModel,
         jsonSchema: params.jsonSchema,
+        addDir: params.addDir,
     }));
     return {
         corrId,
@@ -1126,6 +1127,19 @@ export function prepareCodexRequest(params, runtime = resolveGatewayServerRuntim
     // and are emitted in both branches.
     let highImpactCleanup;
     if (sessionPlan.mode === "new") {
+        // Phase 4 slice ζ: emit working-dir and add-dir on new sessions only.
+        // Both flags are listed in CODEX_RESUME_FILTERED_FLAGS — resume inherits
+        // the original session's cwd and writable-dir policy, so emitting them
+        // on resume would be silently stripped (wasteful + misleading on argv
+        // logs). Gating here mirrors `--search` / `--sandbox` / `--full-auto`.
+        if (params.workingDir) {
+            args.push("-C", params.workingDir);
+        }
+        if (params.addDir && params.addDir.length > 0) {
+            for (const dir of params.addDir) {
+                args.push("--add-dir", dir);
+            }
+        }
         const high = prepareCodexHighImpactFlags({
             outputSchema: params.outputSchema,
             search: params.search,
@@ -1381,6 +1395,28 @@ export function prepareGrokRequest(params, runtime = resolveGatewayServerRuntime
     if (params.maxTurns !== undefined) {
         args.push("--max-turns", String(params.maxTurns));
     }
+    if (params.workingDir) {
+        args.push("--cwd", params.workingDir);
+    }
+    if (params.sandbox) {
+        args.push("--sandbox", params.sandbox);
+    }
+    if (params.rules) {
+        args.push("--rules", params.rules);
+    }
+    if (params.systemPromptOverride) {
+        args.push("--system-prompt-override", params.systemPromptOverride);
+    }
+    if (params.allow && params.allow.length > 0) {
+        for (const rule of params.allow) {
+            args.push("--allow", rule);
+        }
+    }
+    if (params.deny && params.deny.length > 0) {
+        for (const rule of params.deny) {
+            args.push("--deny", rule);
+        }
+    }
     return {
         corrId,
         effectivePrompt,
@@ -1467,6 +1503,8 @@ export function prepareMistralRequest(params, runtime = resolveGatewayServerRunt
         trust: params.trust,
         maxTurns: params.maxTurns,
         maxPrice: params.maxPrice,
+        workingDir: params.workingDir,
+        addDir: params.addDir,
     });
     if (prep.ignoredDisallowedTools) {
         runtime.logger.info(`[${corrId}] Mistral does not support disallowedTools; ignoring (caller passed ${params.disallowedTools?.length ?? 0} entries)`);
@@ -1521,6 +1559,8 @@ export function buildMistralRetryPrep(params, recoveryModel) {
         trust: params.trust,
         maxTurns: params.maxTurns,
         maxPrice: params.maxPrice,
+        workingDir: params.workingDir,
+        addDir: params.addDir,
     });
 }
 function buildCliResponse(cli, stdout, optimizeResponse, corrId, sessionId, prep, durationMs, resumable, outputFormat, warnings) {
@@ -1862,6 +1902,12 @@ export async function handleGrokRequest(deps, params) {
         optimizePrompt: params.optimizePrompt,
         operation: "grok_request",
         maxTurns: params.maxTurns,
+        workingDir: params.workingDir,
+        sandbox: params.sandbox,
+        rules: params.rules,
+        systemPromptOverride: params.systemPromptOverride,
+        allow: params.allow,
+        deny: params.deny,
     }, runtime);
     if (!("args" in prep))
         return prep;
@@ -1983,6 +2029,12 @@ export async function handleGrokRequestAsync(deps, params) {
         optimizePrompt: params.optimizePrompt,
         operation: "grok_request_async",
         maxTurns: params.maxTurns,
+        workingDir: params.workingDir,
+        sandbox: params.sandbox,
+        rules: params.rules,
+        systemPromptOverride: params.systemPromptOverride,
+        allow: params.allow,
+        deny: params.deny,
     }, runtime);
     if (!("args" in prep))
         return prep;
@@ -2067,6 +2119,8 @@ export async function handleMistralRequest(deps, params) {
         trust: params.trust,
         maxTurns: params.maxTurns,
         maxPrice: params.maxPrice,
+        workingDir: params.workingDir,
+        addDir: params.addDir,
     }, runtime);
     if (!("args" in prep))
         return prep;
@@ -2202,6 +2256,8 @@ export async function handleMistralRequestAsync(deps, params) {
         trust: params.trust,
         maxTurns: params.maxTurns,
         maxPrice: params.maxPrice,
+        workingDir: params.workingDir,
+        addDir: params.addDir,
     }, runtime);
     if (!("args" in prep))
         return prep;
@@ -2290,6 +2346,8 @@ export async function handleCodexRequestAsync(deps, params) {
         images: params.images,
         ignoreUserConfig: params.ignoreUserConfig,
         ignoreRules: params.ignoreRules,
+        workingDir: params.workingDir,
+        addDir: params.addDir,
     }, runtime);
     if (!("args" in prep))
         return prep;
@@ -2493,6 +2551,11 @@ export function createGatewayServer(deps = {}) {
             .union([z.string(), z.record(z.unknown())])
             .optional()
             .describe("Claude --json-schema: JSON Schema literal (NOT a path) constraining structured output. Object values are JSON.stringify-d; string values are passed verbatim. Use with outputFormat='json'."),
+        // Phase 4 slice ζ — Claude additional-workspace-dirs parity
+        addDir: z
+            .array(z.string())
+            .optional()
+            .describe("Claude --add-dir: additional directories the CLI is allowed to read/write beyond the process cwd. Each entry is emitted as its own --add-dir instance."),
         approvalStrategy: z
             .enum(["legacy", "mcp_managed"])
             .default("legacy")
@@ -2523,7 +2586,7 @@ export function createGatewayServer(deps = {}) {
             .boolean()
             .default(false)
             .describe("Bypass dedup and force a fresh CLI run even if a recent identical request exists"),
-    }, async ({ prompt, promptParts, model, outputFormat, sessionId, continueSession, createNewSession, allowedTools, disallowedTools, dangerouslySkipPermissions, permissionMode, agent, agents, forkSession, systemPrompt, appendSystemPrompt, maxBudgetUsd, maxTurns, effort, excludeDynamicSystemPromptSections, fallbackModel, jsonSchema, approvalStrategy, approvalPolicy, mcpServers, strictMcpConfig, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, }) => {
+    }, async ({ prompt, promptParts, model, outputFormat, sessionId, continueSession, createNewSession, allowedTools, disallowedTools, dangerouslySkipPermissions, permissionMode, agent, agents, forkSession, systemPrompt, appendSystemPrompt, maxBudgetUsd, maxTurns, effort, excludeDynamicSystemPromptSections, fallbackModel, jsonSchema, addDir, approvalStrategy, approvalPolicy, mcpServers, strictMcpConfig, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, }) => {
         const startTime = Date.now();
         if (systemPrompt !== undefined && appendSystemPrompt !== undefined) {
             return createErrorResponse("claude", 1, "", correlationId, new Error("systemPrompt and appendSystemPrompt are mutually exclusive; use one or the other (not both)."));
@@ -2555,6 +2618,7 @@ export function createGatewayServer(deps = {}) {
             excludeDynamicSystemPromptSections,
             fallbackModel,
             jsonSchema,
+            addDir,
         }, runtime);
         if (!("args" in prep))
             return prep;
@@ -2809,7 +2873,17 @@ export function createGatewayServer(deps = {}) {
             .boolean()
             .optional()
             .describe("Codex --ignore-rules: skip project rule files for this run."),
-    }, async ({ prompt, promptParts, model, fullAuto, sandboxMode, askForApproval, useLegacyFullAutoFlag, dangerouslyBypassApprovalsAndSandbox, approvalStrategy, approvalPolicy, mcpServers, sessionId, resumeLatest, createNewSession, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, outputFormat, outputSchema, search, profile, configOverrides, ephemeral, images, ignoreUserConfig, ignoreRules, }) => {
+        // Phase 4 slice ζ — Codex working-dir + add-dir parity (new sessions only).
+        workingDir: z
+            .string()
+            .min(1)
+            .optional()
+            .describe("Codex -C/--cd <DIR>: working root for this session. Emitted on new sessions only; resume inherits the original session's cwd via CODEX_RESUME_FILTERED_FLAGS."),
+        addDir: z
+            .array(z.string())
+            .optional()
+            .describe("Codex --add-dir <DIR>: additional writable workspace directories. Emitted once per entry on new sessions only; resume inherits the original session's writable-dir policy."),
+    }, async ({ prompt, promptParts, model, fullAuto, sandboxMode, askForApproval, useLegacyFullAutoFlag, dangerouslyBypassApprovalsAndSandbox, approvalStrategy, approvalPolicy, mcpServers, sessionId, resumeLatest, createNewSession, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, outputFormat, outputSchema, search, profile, configOverrides, ephemeral, images, ignoreUserConfig, ignoreRules, workingDir, addDir, }) => {
         const startTime = Date.now();
         const prep = prepareCodexRequest({
             prompt,
@@ -2838,6 +2912,8 @@ export function createGatewayServer(deps = {}) {
             images,
             ignoreUserConfig,
             ignoreRules,
+            workingDir,
+            addDir,
         }, runtime);
         if (!("args" in prep))
             return prep;
@@ -3209,7 +3285,37 @@ export function createGatewayServer(deps = {}) {
             .default(false)
             .describe("Bypass dedup and force a fresh CLI run even if a recent identical request exists"),
         maxTurns: MAX_TURNS_SCHEMA.optional().describe("Grok `--max-turns N`: cap on agent-loop iterations for cost / latency control (Phase 4 slice δ). Bounded to safe integers ≤ 10000."),
-    }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, alwaysApprove, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, maxTurns, }) => {
+        // Phase 4 slice ζ — Grok working-directory parity.
+        workingDir: z
+            .string()
+            .min(1)
+            .optional()
+            .describe("Grok --cwd <DIR>: working directory for this invocation. Lets headless callers run Grok against a directory other than the gateway process's cwd."),
+        // Phase 4 slice θ — Grok HIGH parity (sandbox, rules, system-prompt-override, allow, deny).
+        sandbox: z
+            .string()
+            .min(1)
+            .optional()
+            .describe("Grok --sandbox <PROFILE>: sandbox profile for filesystem and network access. Freeform per `grok --help` (no enum constraint on Grok 0.1.210); also settable via GROK_SANDBOX env var. Caller responsibility to pass a valid profile name."),
+        rules: z
+            .string()
+            .min(1)
+            .optional()
+            .describe("Grok --rules <RULES>: extra rules to append to the system prompt. Supports `@file` prefix per `grok --help` to load from a file; gateway passes the value verbatim and lets Grok parse the prefix."),
+        systemPromptOverride: z
+            .string()
+            .min(1)
+            .optional()
+            .describe("Grok --system-prompt-override <PROMPT>: replace the agent's system prompt entirely. Distinct from Claude's --system-prompt / --append-system-prompt (Grok has only one override flag, not a pair)."),
+        allow: z
+            .array(z.string())
+            .optional()
+            .describe('Grok --allow <RULE>: permission allow rules. Each entry is emitted as its own --allow instance (per `grok --help`: "Repeat to add multiple rules").'),
+        deny: z
+            .array(z.string())
+            .optional()
+            .describe('Grok --deny <RULE>: permission deny rules. Each entry is emitted as its own --deny instance (per `grok --help`: "Repeat to add multiple rules").'),
+    }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, alwaysApprove, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, maxTurns, workingDir, sandbox, rules, systemPromptOverride, allow, deny, }) => {
         return handleGrokRequest({ sessionManager, logger, runtime }, {
             prompt,
             promptParts,
@@ -3233,6 +3339,12 @@ export function createGatewayServer(deps = {}) {
             idleTimeoutMs,
             forceRefresh,
             maxTurns,
+            workingDir,
+            sandbox,
+            rules,
+            systemPromptOverride,
+            allow,
+            deny,
         });
     });
     //──────────────────────────────────────────────────────────────────────────────
@@ -3312,7 +3424,17 @@ export function createGatewayServer(deps = {}) {
             .describe("Emit `--trust` so Vibe trusts the cwd for this invocation only (not persisted to trusted_folders.toml) and skips the interactive trust prompt (Phase 4 slice γ)."),
         maxTurns: MAX_TURNS_SCHEMA.optional().describe("Vibe `--max-turns N`: cap the agent-loop iteration count (programmatic mode only, Phase 4 slice δ). Bounded to safe integers ≤ 10000."),
         maxPrice: MAX_PRICE_SCHEMA.optional().describe("Vibe `--max-price DOLLARS`: interrupt the session when cumulative cost crosses this cap (programmatic mode only, Phase 4 slice δ). Bounded to finite values ≤ 10000 USD."),
-    }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, trust, maxTurns, maxPrice, }) => {
+        // Phase 4 slice ζ — Vibe working-directory + additional-dirs parity.
+        workingDir: z
+            .string()
+            .min(1)
+            .optional()
+            .describe("Vibe --workdir <DIR>: change to this directory before running. Single value (Vibe accepts one --workdir per invocation)."),
+        addDir: z
+            .array(z.string())
+            .optional()
+            .describe("Vibe --add-dir <DIR>: additional writable workspace directories. Each entry is emitted as its own --add-dir instance (Vibe states this flag may be specified multiple times)."),
+    }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, trust, maxTurns, maxPrice, workingDir, addDir, }) => {
         return handleMistralRequest({ sessionManager, logger, runtime }, {
             prompt,
             promptParts,
@@ -3337,6 +3459,8 @@ export function createGatewayServer(deps = {}) {
             trust,
             maxTurns,
             maxPrice,
+            workingDir,
+            addDir,
         });
     });
     //──────────────────────────────────────────────────────────────────────────────
@@ -3432,6 +3556,11 @@ export function createGatewayServer(deps = {}) {
                 .union([z.string(), z.record(z.unknown())])
                 .optional()
                 .describe("Claude --json-schema: JSON Schema literal (NOT a path) constraining structured output. Object values are JSON.stringify-d; string values are passed verbatim. Use with outputFormat='json'."),
+            // Phase 4 slice ζ — Claude additional-workspace-dirs parity
+            addDir: z
+                .array(z.string())
+                .optional()
+                .describe("Claude --add-dir: additional directories the CLI is allowed to read/write beyond the process cwd. Each entry is emitted as its own --add-dir instance."),
             approvalStrategy: z
                 .enum(["legacy", "mcp_managed"])
                 .default("legacy")
@@ -3461,7 +3590,7 @@ export function createGatewayServer(deps = {}) {
                 .boolean()
                 .default(false)
                 .describe("Bypass dedup and force a fresh CLI run even if a recent identical request exists"),
-        }, async ({ prompt, promptParts, model, outputFormat, sessionId, continueSession, createNewSession, allowedTools, disallowedTools, dangerouslySkipPermissions, permissionMode, agent, agents, forkSession, systemPrompt, appendSystemPrompt, maxBudgetUsd, maxTurns, effort, excludeDynamicSystemPromptSections, fallbackModel, jsonSchema, approvalStrategy, approvalPolicy, mcpServers, strictMcpConfig, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, }) => {
+        }, async ({ prompt, promptParts, model, outputFormat, sessionId, continueSession, createNewSession, allowedTools, disallowedTools, dangerouslySkipPermissions, permissionMode, agent, agents, forkSession, systemPrompt, appendSystemPrompt, maxBudgetUsd, maxTurns, effort, excludeDynamicSystemPromptSections, fallbackModel, jsonSchema, addDir, approvalStrategy, approvalPolicy, mcpServers, strictMcpConfig, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, }) => {
             if (systemPrompt !== undefined && appendSystemPrompt !== undefined) {
                 return createErrorResponse("claude", 1, "", correlationId, new Error("systemPrompt and appendSystemPrompt are mutually exclusive; use one or the other (not both)."));
             }
@@ -3492,6 +3621,7 @@ export function createGatewayServer(deps = {}) {
                 excludeDynamicSystemPromptSections,
                 fallbackModel,
                 jsonSchema,
+                addDir,
             }, runtime);
             if (!("args" in prep))
                 return prep;
@@ -3646,7 +3776,17 @@ export function createGatewayServer(deps = {}) {
             images: z.array(z.string()).optional().describe("Codex -i <path>: image attachments."),
             ignoreUserConfig: z.boolean().optional().describe("Codex --ignore-user-config."),
             ignoreRules: z.boolean().optional().describe("Codex --ignore-rules."),
-        }, async ({ prompt, promptParts, model, fullAuto, sandboxMode, askForApproval, useLegacyFullAutoFlag, dangerouslyBypassApprovalsAndSandbox, approvalStrategy, approvalPolicy, mcpServers, sessionId, resumeLatest, createNewSession, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, outputFormat, outputSchema, search, profile, configOverrides, ephemeral, images, ignoreUserConfig, ignoreRules, }) => {
+            // Phase 4 slice ζ — Codex working-dir + add-dir parity (new sessions only).
+            workingDir: z
+                .string()
+                .min(1)
+                .optional()
+                .describe("Codex -C/--cd <DIR>: working root for this session. New sessions only; resume inherits the original session's cwd."),
+            addDir: z
+                .array(z.string())
+                .optional()
+                .describe("Codex --add-dir <DIR>: additional writable workspace directories (repeat per entry). New sessions only."),
+        }, async ({ prompt, promptParts, model, fullAuto, sandboxMode, askForApproval, useLegacyFullAutoFlag, dangerouslyBypassApprovalsAndSandbox, approvalStrategy, approvalPolicy, mcpServers, sessionId, resumeLatest, createNewSession, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, outputFormat, outputSchema, search, profile, configOverrides, ephemeral, images, ignoreUserConfig, ignoreRules, workingDir, addDir, }) => {
             return handleCodexRequestAsync({ sessionManager, asyncJobManager, logger, runtime }, {
                 prompt,
                 promptParts,
@@ -3675,6 +3815,8 @@ export function createGatewayServer(deps = {}) {
                 images,
                 ignoreUserConfig,
                 ignoreRules,
+                workingDir,
+                addDir,
             });
         });
         server.tool("gemini_request_async", {
@@ -3841,7 +3983,37 @@ export function createGatewayServer(deps = {}) {
                 .default(false)
                 .describe("Bypass dedup and force a fresh CLI run even if a recent identical request exists"),
             maxTurns: MAX_TURNS_SCHEMA.optional().describe("Grok `--max-turns N`: cap on agent-loop iterations for cost / latency control (Phase 4 slice δ). Bounded to safe integers ≤ 10000."),
-        }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, alwaysApprove, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, maxTurns, }) => {
+            // Phase 4 slice ζ — Grok working-directory parity.
+            workingDir: z
+                .string()
+                .min(1)
+                .optional()
+                .describe("Grok --cwd <DIR>: working directory for this invocation. Lets headless callers run Grok against a directory other than the gateway process's cwd."),
+            // Phase 4 slice θ — Grok HIGH parity (sandbox, rules, system-prompt-override, allow, deny).
+            sandbox: z
+                .string()
+                .min(1)
+                .optional()
+                .describe("Grok --sandbox <PROFILE>: sandbox profile for filesystem and network access. Freeform per `grok --help` (no enum constraint); also settable via GROK_SANDBOX env var."),
+            rules: z
+                .string()
+                .min(1)
+                .optional()
+                .describe("Grok --rules <RULES>: extra rules to append to the system prompt. Supports `@file` prefix; gateway passes the value verbatim."),
+            systemPromptOverride: z
+                .string()
+                .min(1)
+                .optional()
+                .describe("Grok --system-prompt-override <PROMPT>: replace the agent's system prompt entirely."),
+            allow: z
+                .array(z.string())
+                .optional()
+                .describe("Grok --allow <RULE>: permission allow rules. Each entry → its own --allow instance."),
+            deny: z
+                .array(z.string())
+                .optional()
+                .describe("Grok --deny <RULE>: permission deny rules. Each entry → its own --deny instance."),
+        }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, alwaysApprove, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, maxTurns, workingDir, sandbox, rules, systemPromptOverride, allow, deny, }) => {
             return handleGrokRequestAsync({ sessionManager, asyncJobManager, logger, runtime }, {
                 prompt,
                 promptParts,
@@ -3864,6 +4036,12 @@ export function createGatewayServer(deps = {}) {
                 idleTimeoutMs,
                 forceRefresh,
                 maxTurns,
+                workingDir,
+                sandbox,
+                rules,
+                systemPromptOverride,
+                allow,
+                deny,
             });
         });
         server.tool("mistral_request_async", {
@@ -3939,7 +4117,17 @@ export function createGatewayServer(deps = {}) {
                 .describe("Emit `--trust` so Vibe trusts the cwd for this invocation only (not persisted to trusted_folders.toml) and skips the interactive trust prompt (Phase 4 slice γ)."),
             maxTurns: MAX_TURNS_SCHEMA.optional().describe("Vibe `--max-turns N`: cap the agent-loop iteration count (programmatic mode only, Phase 4 slice δ). Bounded to safe integers ≤ 10000."),
             maxPrice: MAX_PRICE_SCHEMA.optional().describe("Vibe `--max-price DOLLARS`: interrupt the session when cumulative cost crosses this cap (programmatic mode only, Phase 4 slice δ). Bounded to finite values ≤ 10000 USD."),
-        }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, trust, maxTurns, maxPrice, }) => {
+            // Phase 4 slice ζ — Vibe working-directory + additional-dirs parity.
+            workingDir: z
+                .string()
+                .min(1)
+                .optional()
+                .describe("Vibe --workdir <DIR>: change to this directory before running. Single value per invocation."),
+            addDir: z
+                .array(z.string())
+                .optional()
+                .describe("Vibe --add-dir <DIR>: additional writable workspace directories. Each entry is emitted as its own --add-dir instance."),
+        }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, trust, maxTurns, maxPrice, workingDir, addDir, }) => {
             return handleMistralRequestAsync({ sessionManager, asyncJobManager, logger, runtime }, {
                 prompt,
                 promptParts,
@@ -3963,6 +4151,8 @@ export function createGatewayServer(deps = {}) {
                 trust,
                 maxTurns,
                 maxPrice,
+                workingDir,
+                addDir,
             });
         });
         server.tool("llm_job_status", {
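The Grok schema additions above describe how each new field maps to a CLI flag: scalar fields become a single flag/value pair, while `allow` and `deny` emit one flag instance per entry. The handler that performs this mapping is not part of the diff, so the following is only a minimal sketch of that mapping under the assumption that it follows the field descriptions; `buildGrokSliceArgs` is a hypothetical name, not a gateway function.

```javascript
// Hypothetical helper: the gateway's real Grok handler is not shown in this diff.
// It maps the slice ζ/θ fields to argv tokens as their schema descriptions state.
function buildGrokSliceArgs(input) {
    const args = [];
    if (input.workingDir) args.push("--cwd", input.workingDir);
    if (input.sandbox) args.push("--sandbox", input.sandbox); // freeform, passed verbatim
    if (input.rules) args.push("--rules", input.rules); // "@file" prefix left for Grok to parse
    if (input.systemPromptOverride) {
        args.push("--system-prompt-override", input.systemPromptOverride);
    }
    // Repeatable flags: one instance per rule.
    for (const rule of input.allow ?? []) args.push("--allow", rule);
    for (const rule of input.deny ?? []) args.push("--deny", rule);
    return args;
}

const grokArgs = buildGrokSliceArgs({
    sandbox: "workspace-write",
    rules: "@./rules.md",
    allow: ["bash", "edit"],
    deny: ["write"],
});
console.log(grokArgs.join(" "));
// --sandbox workspace-write --rules @./rules.md --allow bash --allow edit --deny write
```

The example values are taken from the conformance fixtures in `upstream-contracts.js`; whether a given sandbox profile or rule name is valid is decided by Grok itself.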

package/dist/request-helpers.d.ts CHANGED

@@ -125,6 +125,17 @@ export interface PrepareMistralRequestInput {
      * only).
      */
     maxPrice?: number;
+    /**
+     * Phase 4 slice ζ: emit `--workdir <DIR>` so Vibe changes into the named
+     * directory before running. Single value (Vibe accepts one --workdir).
+     */
+    workingDir?: string;
+    /**
+     * Phase 4 slice ζ: emit `--add-dir <DIR>` per directory. Vibe's `--help`
+     * states the flag "Can be specified multiple times" — each entry is its
+     * own argv pair.
+     */
+    addDir?: string[];
 }
 export interface PrepareMistralRequestResult {
     args: string[];
@@ -364,6 +375,15 @@ export interface ClaudeHighImpactFlagsInput {
      * `--output-schema`, which takes a path).
      */
     jsonSchema?: string | Record<string, unknown>;
+    /**
+     * Phase 4 slice ζ — Claude `--add-dir <dirs...>`. Additional directories the
+     * Claude CLI is allowed to read/write beyond the process cwd. The CLI accepts
+     * a single variadic flag (space-separated values) per `claude --help`; we
+     * emit one `--add-dir` instance per directory so each path is its own argv
+     * token (survives any future tightening of the variadic parser without
+     * changing the call site).
+     */
+    addDir?: string[];
 }
 /**
  * Emit Claude high-impact feature flags (U25) as a flat argv segment.

package/dist/request-helpers.js CHANGED

@@ -185,6 +185,14 @@ export function prepareMistralRequest(input) {
     if (input.maxPrice !== undefined) {
         args.push("--max-price", String(input.maxPrice));
     }
+    if (input.workingDir) {
+        args.push("--workdir", input.workingDir);
+    }
+    if (input.addDir && input.addDir.length > 0) {
+        for (const dir of input.addDir) {
+            args.push("--add-dir", dir);
+        }
+    }
     const ignoredDisallowedTools = Boolean(input.disallowedTools && input.disallowedTools.length > 0);
     return { args, env, ignoredDisallowedTools };
 }
@@ -445,6 +453,11 @@ export function prepareClaudeHighImpactFlags(input) {
         const schemaArg = typeof input.jsonSchema === "string" ? input.jsonSchema : JSON.stringify(input.jsonSchema);
         args.push("--json-schema", schemaArg);
     }
+    if (input.addDir && input.addDir.length > 0) {
+        for (const dir of input.addDir) {
+            args.push("--add-dir", dir);
+        }
+    }
     return args;
 }
 //──────────────────────────────────────────────────────────────────────────────
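Both hunks above emit `--add-dir` once per directory rather than as a single variadic flag. A minimal sketch of why that keeps token boundaries unambiguous, assuming the arguments are handed to the child process as an argv array (no shell splitting); `pushRepeated` is a hypothetical helper name used only for illustration.

```javascript
// Hypothetical helper mirroring the loop used in the hunks above:
// each value is pushed together with its own copy of the flag.
function pushRepeated(args, flag, values) {
    for (const value of values ?? []) {
        args.push(flag, value);
    }
    return args;
}

const repeated = pushRepeated(["-p", "hello"], "--add-dir", ["/tmp/a", "/tmp/my project"]);
// Each directory is a single argv token, so a path containing a space stays intact.
// A variadic emission would be ["--add-dir", "/tmp/a", "/tmp/my project"], where any
// token placed after the last directory could be read as one more directory.
console.log(repeated);
```

An undefined or empty list adds nothing, which matches the `length > 0` guard in the original code.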

package/dist/upstream-contracts.js CHANGED

@@ -39,6 +39,8 @@ export const UPSTREAM_CLI_CONTRACTS = {
             "excludeDynamicSystemPromptSections",
             "fallbackModel",
             "jsonSchema",
+            // Phase 4 slice ζ
+            "addDir",
             "approvalStrategy",
             "mcpServers",
             "strictMcpConfig",
@@ -88,6 +90,10 @@ export const UPSTREAM_CLI_CONTRACTS = {
                 arity: "one",
                 description: "JSON Schema literal constraining structured output",
             },
+            "--add-dir": {
+                arity: "one",
+                description: "Additional workspace directory (Phase 4 slice ζ; repeat once per directory)",
+            },
             "--continue": { arity: "none", description: "Continue active session" },
             "--session-id": { arity: "one", description: "Session id" },
         },
@@ -128,6 +134,14 @@ export const UPSTREAM_CLI_CONTRACTS = {
                 ],
                 expect: "pass",
             },
+            {
+                // Phase 4 slice ζ: --add-dir wired through prepareClaudeHighImpactFlags.
+                // Repeated once per directory; each instance has arity:"one".
+                id: "claude-add-dir",
+                description: "Phase 4 slice ζ: repeated --add-dir is accepted",
+                args: ["-p", "hello", "--add-dir", "/tmp/a", "--add-dir", "/tmp/b"],
+                expect: "pass",
+            },
         ],
     },
     codex: {
@@ -164,6 +178,9 @@ export const UPSTREAM_CLI_CONTRACTS = {
             "images",
             "ignoreUserConfig",
             "ignoreRules",
+            // Phase 4 slice ζ
+            "workingDir",
+            "addDir",
         ],
         resumeOnlyFlags: ["--last"],
         // Phase 4 slice α (v1.8.0) verified that `codex exec resume` accepts
@@ -203,6 +220,18 @@ export const UPSTREAM_CLI_CONTRACTS = {
             "-i": { arity: "one", description: "Image path" },
             "--ignore-user-config": { arity: "none", description: "Ignore user config" },
             "--ignore-rules": { arity: "none", description: "Ignore rule files" },
+            // The gateway only ever emits the short form `-C` (codex 0.134.0 accepts
+            // both `-C` and `--cd` as aliases). The contract registers exactly what
+            // we emit; if a future code path emits `--cd` instead, the contract
+            // check will fail loudly — which is the intended catch.
+            "-C": {
+                arity: "one",
+                description: "Working root for the session (Phase 4 slice ζ; new sessions only)",
+            },
+            "--add-dir": {
+                arity: "one",
+                description: "Additional writable workspace directory (Phase 4 slice ζ; repeat once per directory; new sessions only)",
+            },
         },
         env: {},
         conformanceFixtures: [
@@ -239,6 +268,26 @@ export const UPSTREAM_CLI_CONTRACTS = {
                 args: ["exec", "resume", "--search", "session-id", "hello"],
                 expect: "fail",
             },
+            {
+                id: "codex-working-dir",
+                description: "Phase 4 slice ζ: -C <DIR> accepted on a new session",
+                args: ["exec", "--skip-git-repo-check", "-C", "/tmp/work", "hello"],
+                expect: "pass",
+            },
+            {
+                id: "codex-add-dir",
+                description: "Phase 4 slice ζ: repeated --add-dir accepted on a new session",
+                args: [
+                    "exec",
+                    "--skip-git-repo-check",
+                    "--add-dir",
+                    "/tmp/a",
+                    "--add-dir",
+                    "/tmp/b",
+                    "hello",
+                ],
+                expect: "pass",
+            },
         ],
     },
     gemini: {
@@ -350,6 +399,14 @@ export const UPSTREAM_CLI_CONTRACTS = {
             "disallowedTools",
             // Phase 4 slice δ
             "maxTurns",
+            // Phase 4 slice ζ
+            "workingDir",
+            // Phase 4 slice θ — Grok HIGH parity
+            "sandbox",
+            "rules",
+            "systemPromptOverride",
+            "allow",
+            "deny",
         ],
         flags: {
             "-p": { arity: "one", description: "Prompt text" },
@@ -379,6 +436,34 @@ export const UPSTREAM_CLI_CONTRACTS = {
                 pattern: /^[1-9][0-9]*$/,
                 description: "Agent-loop iteration cap (Phase 4 slice δ)",
             },
+            "--cwd": {
+                arity: "one",
+                description: "Working directory for the invocation (Phase 4 slice ζ)",
+            },
+            // Phase 4 slice θ — Grok HIGH parity. `--sandbox` is freeform per
+            // `grok --help` on 0.1.210 (no `[possible values: …]` list, unlike
+            // --effort / --permission-mode / --output-format), so we register
+            // it without a `values` constraint.
+            "--sandbox": {
+                arity: "one",
+                description: "Sandbox profile for filesystem + network access (Phase 4 slice θ; freeform passthrough; env: GROK_SANDBOX)",
+            },
+            "--rules": {
+                arity: "one",
+                description: "Extra rules appended to the system prompt; supports `@file` prefix (Phase 4 slice θ)",
+            },
+            "--system-prompt-override": {
+                arity: "one",
+                description: "Replace the agent's system prompt entirely (Phase 4 slice θ)",
+            },
+            "--allow": {
+                arity: "one",
+                description: "Permission allow rule (Phase 4 slice θ; repeat once per rule per `grok --help`)",
+            },
+            "--deny": {
+                arity: "one",
+                description: "Permission deny rule (Phase 4 slice θ; repeat once per rule per `grok --help`)",
+            },
         },
         env: {},
         conformanceFixtures: [
@@ -406,6 +491,42 @@ export const UPSTREAM_CLI_CONTRACTS = {
                 args: ["-p", "hello", "--max-turns", "0"],
                 expect: "fail",
             },
+            {
+                id: "grok-working-dir",
+                description: "Phase 4 slice ζ: --cwd <DIR> is accepted",
+                args: ["-p", "hello", "--cwd", "/tmp/work"],
+                expect: "pass",
+            },
+            {
+                id: "grok-sandbox",
+                description: "Phase 4 slice θ: --sandbox <PROFILE> accepted (freeform)",
+                args: ["-p", "hello", "--sandbox", "workspace-write"],
+                expect: "pass",
+            },
+            {
+                id: "grok-rules",
+                description: "Phase 4 slice θ: --rules <RULES> accepted (@file prefix preserved)",
+                args: ["-p", "hello", "--rules", "@./rules.md"],
+                expect: "pass",
+            },
+            {
+                id: "grok-system-prompt-override",
+                description: "Phase 4 slice θ: --system-prompt-override <PROMPT> accepted",
+                args: ["-p", "hello", "--system-prompt-override", "You are a tester"],
+                expect: "pass",
+            },
+            {
+                id: "grok-allow-repeated",
+                description: "Phase 4 slice θ: repeated --allow <RULE> accepted",
+                args: ["-p", "hello", "--allow", "bash", "--allow", "edit"],
+                expect: "pass",
+            },
+            {
+                id: "grok-deny-repeated",
+                description: "Phase 4 slice θ: repeated --deny <RULE> accepted",
+                args: ["-p", "hello", "--deny", "write", "--deny", "kill"],
+                expect: "pass",
+            },
         ],
     },
     mistral: {
@@ -434,6 +555,9 @@ export const UPSTREAM_CLI_CONTRACTS = {
             // Phase 4 slice δ
             "maxTurns",
             "maxPrice",
+            // Phase 4 slice ζ
+            "workingDir",
+            "addDir",
         ],
         flags: {
             "-p": { arity: "one", description: "Prompt text" },
@@ -468,6 +592,14 @@ export const UPSTREAM_CLI_CONTRACTS = {
                 pattern: /^(0|[1-9][0-9]*)(\.[0-9]+)?$/,
                 description: "Cumulative cost cap in USD (Phase 4 slice δ, programmatic mode only)",
             },
+            "--workdir": {
+                arity: "one",
+                description: "Working directory for the invocation (Phase 4 slice ζ)",
+            },
+            "--add-dir": {
+                arity: "one",
+                description: "Additional writable workspace directory (Phase 4 slice ζ; repeat once per directory)",
+            },
         },
         env: {
             VIBE_ACTIVE_MODEL: {
@@ -512,6 +644,29 @@ export const UPSTREAM_CLI_CONTRACTS = {
                 env: { VIBE_ACTIVE_MODEL: "mistral-medium-3.5" },
                 expect: "fail",
             },
+            {
+                id: "mistral-working-dir",
+                description: "Phase 4 slice ζ: --workdir <DIR> is accepted",
+                args: ["-p", "hello", "--agent", "auto-approve", "--workdir", "/tmp/work"],
+                env: { VIBE_ACTIVE_MODEL: "mistral-medium-3.5" },
+                expect: "pass",
+            },
+            {
+                id: "mistral-add-dir",
+                description: "Phase 4 slice ζ: repeated --add-dir is accepted",
+                args: [
+                    "-p",
+                    "hello",
+                    "--agent",
+                    "auto-approve",
+                    "--add-dir",
+                    "/tmp/a",
+                    "--add-dir",
+                    "/tmp/b",
+                ],
+                env: { VIBE_ACTIVE_MODEL: "mistral-medium-3.5" },
+                expect: "pass",
+            },
         ],
     },
 };
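The contract entries above pair each flag with an `arity` and, for some flags, a `pattern`, and each conformance fixture states whether an argv is expected to `pass` or `fail`. The runner that evaluates fixtures is not included in this diff. The sketch below is one assumed way such a check could work, using a trimmed copy of three Grok entries; `checkArgs` and `grokFlags` are hypothetical names.

```javascript
// Trimmed, illustrative copy of a few Grok flag entries from the contract above.
const grokFlags = {
    "-p": { arity: "one" },
    "--max-turns": { arity: "one", pattern: /^[1-9][0-9]*$/ },
    "--allow": { arity: "one" },
};

// Hypothetical checker: not the gateway's actual conformance runner.
function checkArgs(flags, args) {
    for (let i = 0; i < args.length; i++) {
        const token = args[i];
        if (!token.startsWith("-")) continue; // positional token (prompt, subcommand)
        const spec = flags[token];
        if (!spec) return "fail"; // flag is not registered in the contract
        if (spec.arity === "one") {
            const value = args[++i]; // consume the flag's single value
            if (value === undefined) return "fail";
            if (spec.pattern && !spec.pattern.test(value)) return "fail";
        }
    }
    return "pass";
}

console.log(checkArgs(grokFlags, ["-p", "hello", "--allow", "bash", "--allow", "edit"])); // pass
console.log(checkArgs(grokFlags, ["-p", "hello", "--max-turns", "0"])); // fail
console.log(checkArgs(grokFlags, ["-p", "hello", "--cd", "/tmp/work"])); // fail
```

Because a repeated flag is just several independent `arity: "one"` instances, the contract needs no separate notion of a list-valued flag. The last call also shows the "fail loudly" behavior the Codex comment describes for an unregistered alias.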

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "llm-cli-gateway",
-  "version": "1.11.0",
+  "version": "1.13.0",
   "mcpName": "io.github.verivus-oss/llm-cli-gateway",
   "description": "MCP server providing unified access to Claude Code, Codex, Gemini, Grok, and Mistral Vibe CLIs with session management, retry logic, async job orchestration, durable job results, and cross-LLM validation.",
   "license": "MIT",