npm - llm-cli-gateway - Versions diffs - 1.9.0 → 1.11.0 - Mend

llm-cli-gateway 1.9.0 → 1.11.0

Files changed (9)

package/CHANGELOG.md +111 -0
package/dist/gemini-json-parser.d.ts +19 -4
package/dist/gemini-json-parser.js +73 -4
package/dist/index.d.ts +13 -6
package/dist/index.js +53 -11
package/dist/request-helpers.d.ts +14 -0
package/dist/request-helpers.js +7 -0
package/dist/upstream-contracts.js +50 -1
package/package.json +1 -1

package/CHANGELOG.md CHANGED

@@ -2,6 +2,117 @@
 All notable changes to the llm-cli-gateway project.
+## [1.11.0] - 2026-05-27 — Phase 4 slice η (Claude `--fallback-model` + `--json-schema`)
+Ships the sixth Phase 4 slice: Claude's reliability fallback and
+structured-output JSON-Schema constraint flags are now reachable from
+`claude_request` and `claude_request_async`. Three commits land together
+(feature wiring, contract registration, test-veracity regressions) plus
+this release commit.
+### Added — `--fallback-model` and `--json-schema` for Claude
+- `claude_request` and `claude_request_async` accept a new `fallbackModel`
+  field (non-empty string, validated via `z.string().min(1)`). Threaded
+  through `prepareClaudeRequest` → `prepareClaudeHighImpactFlags`
+  (`src/request-helpers.ts:651`) → `--fallback-model <model>` argv pair.
+  Effective only with Claude `--print`; the gateway always passes `-p`,
+  so no extra gating required.
+- Both tools accept a new `jsonSchema` field
+  (`string | Record<string, unknown>`). Per `claude --help`, the CLI
+  argument is the JSON Schema *literal* (not a path; contrast with Codex
+  `--output-schema`). Object values are `JSON.stringify`-d; string values
+  pass verbatim. Use with `outputFormat: "json"` for structured output
+  validation. Achieves Codex parity for structured-output validation
+  in a single slice.
+- `UPSTREAM_CLI_CONTRACTS.claude.flags` registers `--fallback-model` and
+  `--json-schema` with `arity: "one"`. `mcpParameters` includes both new
+  field names. Two new passing conformance fixtures
+  (`claude-fallback-model`, `claude-json-schema`) pin the contract; both
+  are mechanically validated against `validateUpstreamCliArgs` in the
+  REGRESSIONS Hε suite.
+### Test-veracity audit
+Per the standing protocol (`feedback_test_veracity_audit_protocol`),
+this slice's tests were audited by Codex + Gemini + Grok + Mistral in
+async parallel with mandatory mutation-probe execution. Spec at
+`docs/plans/test-veracity-audit-slice-eta.spec.md`. Round 1 outcomes:
+Grok + Mistral unanimous UNCONDITIONAL APPROVE; Gemini stalled at 682B
+stderr for 15+ minutes (cancelled, documented quota/stall-class
+blocker); Codex initially REJECTED on P-Hβ-4 with an invalid claim
+("removing sync `jsonSchema` left the test green") — pre-verification
+on a clean tree confirmed the mutation does turn `Hα-4` + `Hα-6` RED as
+the spec predicts. Round-2 pushback with the verbatim vitest output:
+Codex self-corrected, reproduced the mutation in a worktree, observed
+the predicted red, restored, and issued UNCONDITIONAL APPROVE.
+Three substantive reviewer approves (Grok, Mistral, Codex) from
+independent vendor families satisfy the multi-LLM gate; Gemini stall
+documented.
+Test count: 816 → 837 (21 new across one file:
+`src/__tests__/test-veracity-regressions-slice-eta.test.ts`).
+### Known caveats
+- `npm run check` still excludes `format:check` (gap first flagged in
+  v1.8.0). Run both locally before pushing.
+- Claude `--fallback-model` and `--json-schema` are CLI-side gated to
+  `--print` mode by Claude itself; both gateway tools always pass `-p`,
+  so this is invisible to callers but worth noting if the upstream CLI
+  flag semantics change.
+## [1.10.0] - 2026-05-27 — Phase 4 slice ε (Gemini `-o stream-json` enum widening)
+Ships the fifth Phase 4 slice: Gemini's NDJSON event-stream output format
+(`-o stream-json`) is now reachable from `gemini_request` and
+`gemini_request_async`. Four commits land together: the feature wiring, a
+contract-table widening, a test-veracity regression suite, and a follow-up
+test fix driven by the multi-LLM round-1 audit.
+### Added — `outputFormat: "stream-json"` for Gemini
+- `gemini_request` and `gemini_request_async` `outputFormat` enums widened
+  from `text | json` to `text | json | stream-json`.
+- `prepareGeminiRequest` emits `-o stream-json` when the new value is set.
+  No `--include-partial-messages` analogue is required: Gemini already
+  streams stdout in real time across all output modes (covered by
+  `CLI_IDLE_TIMEOUTS.gemini = 600_000`).
+- New `parseGeminiStreamJson` parser consumes the NDJSON event stream
+  (`init` / `message` / `result` lines), concatenates assistant `delta`
+  messages into the response, and extracts
+  `input_tokens` / `output_tokens` / `cached` → `cache_read_tokens` from
+  the terminal `result.stats` event.
+- `extractUsageAndCost("gemini", _, "stream-json")` routes to the new
+  parser so usage tokens reach the flight recorder on the stream-json
+  path, matching the existing `-o json` behaviour.
+- `UPSTREAM_CLI_CONTRACTS.gemini.flags["-o"].values` widened to
+  `["json", "stream-json"]`; two new conformance fixtures
+  (`gemini-stream-json` passing, `gemini-output-format-invalid` failing
+  for `-o ndjson`) pin the enum bound.
+### Test-veracity audit
+Per the standing protocol established with v1.9.0
+(`feedback_test_veracity_audit_protocol`), this slice's tests were
+audited by Codex + Gemini + Grok + Mistral in async parallel with
+mandatory mutation-probe execution. Round 1 found one real gap
+(`Eε-4` only checked fixture presence/shape — P-Eε-1 left it green);
+closed in commit `4a78f9c` by running the fixture's args through
+`validateUpstreamCliArgs` inside the same `it()` block. Round 2
+delivered unanimous UNCONDITIONAL APPROVE across all four reviewers,
+with site-by-site probe evidence for the contested `Eα` registered-schema
+helper. Spec at `docs/plans/test-veracity-audit-slice-epsilon.spec.md`.
+Test count: 771 → 795 → 796 (24 + 1 new across two files).
+### Known caveats
+- The `npm run check` script still does not include `format:check` (a
+  gap first flagged in the v1.8.0 release notes). Run both locally
+  before pushing; CI runs format:check separately.
 ## [1.9.0] - 2026-05-27 — Phase 4 slice δ (budget/max-turns parity) + retroactive α/γ contract closure
 Ships the fourth Phase 4 slice (budget/max-turns parity for Grok and Mistral),

package/dist/gemini-json-parser.d.ts CHANGED

@@ -1,13 +1,22 @@
 /**
- * Parser for Gemini CLI `-o json` output.
+ * Parsers for Gemini CLI `-o json` (single object) and `-o stream-json`
+ * (NDJSON event stream) output.
  *
- * Gemini emits a single JSON object with:
+ * `-o json` emits a single JSON object with:
  *   - `response`: string final model output
  *   - `usageMetadata`: { promptTokenCount, candidatesTokenCount,
  *                        cachedContentTokenCount?, totalTokenCount }
  *
- * Returns null when stdout is not parseable as JSON. Returns an object with
- * only `response` when usageMetadata is missing.
+ * `-o stream-json` emits one JSON object per line:
+ *   - `{ "type": "init", "session_id": "...", "model": "..." }`
+ *   - `{ "type": "message", "role": "user", "content": "..." }`
+ *   - `{ "type": "message", "role": "assistant", "content": "...", "delta": true }` (repeated)
+ *   - `{ "type": "result", "status": "success", "stats": { "input_tokens": N,
+ *        "output_tokens": N, "cached": N, ... } }`
+ *
+ * Both parsers return null when stdout is unparseable. Both populate the same
+ * `GeminiJsonParseResult` shape so `extractUsageAndCost` can branch on
+ * outputFormat without further dispatch.
  */
 export interface GeminiUsage {
     input_tokens: number;
@@ -19,3 +28,9 @@ export interface GeminiJsonParseResult {
     response?: string;
 }
 export declare function parseGeminiJson(stdout: string): GeminiJsonParseResult | null;
+/**
+ * Parse Gemini `-o stream-json` NDJSON output. Concatenates assistant `delta`
+ * message content into `response`, extracts the terminal `result.stats` payload
+ * into `usage`. Returns null when stdout contains no parseable JSON line.
+ */
+export declare function parseGeminiStreamJson(stdout: string): GeminiJsonParseResult | null;

package/dist/gemini-json-parser.js CHANGED

@@ -1,13 +1,22 @@
 /**
- * Parser for Gemini CLI `-o json` output.
+ * Parsers for Gemini CLI `-o json` (single object) and `-o stream-json`
+ * (NDJSON event stream) output.
  *
- * Gemini emits a single JSON object with:
+ * `-o json` emits a single JSON object with:
  *   - `response`: string final model output
  *   - `usageMetadata`: { promptTokenCount, candidatesTokenCount,
  *                        cachedContentTokenCount?, totalTokenCount }
  *
- * Returns null when stdout is not parseable as JSON. Returns an object with
- * only `response` when usageMetadata is missing.
+ * `-o stream-json` emits one JSON object per line:
+ *   - `{ "type": "init", "session_id": "...", "model": "..." }`
+ *   - `{ "type": "message", "role": "user", "content": "..." }`
+ *   - `{ "type": "message", "role": "assistant", "content": "...", "delta": true }` (repeated)
+ *   - `{ "type": "result", "status": "success", "stats": { "input_tokens": N,
+ *        "output_tokens": N, "cached": N, ... } }`
+ *
+ * Both parsers return null when stdout is unparseable. Both populate the same
+ * `GeminiJsonParseResult` shape so `extractUsageAndCost` can branch on
+ * outputFormat without further dispatch.
  */
 export function parseGeminiJson(stdout) {
     const trimmed = stdout.trim();
@@ -45,3 +54,63 @@ export function parseGeminiJson(stdout) {
     }
     return result;
 }
+/**
+ * Parse Gemini `-o stream-json` NDJSON output. Concatenates assistant `delta`
+ * message content into `response`, extracts the terminal `result.stats` payload
+ * into `usage`. Returns null when stdout contains no parseable JSON line.
+ */
+export function parseGeminiStreamJson(stdout) {
+    if (!stdout) {
+        return null;
+    }
+    const lines = stdout.split(/\r?\n/);
+    const result = {};
+    const assistantChunks = [];
+    let sawAnyLine = false;
+    for (const line of lines) {
+        const trimmed = line.trim();
+        if (!trimmed)
+            continue;
+        // Gemini stream-json lines are individual JSON objects; non-JSON
+        // chatter (warnings, "Ripgrep not available", etc.) is silently
+        // ignored so a stray banner line doesn't poison usage extraction.
+        let event;
+        try {
+            event = JSON.parse(trimmed);
+        }
+        catch {
+            continue;
+        }
+        if (!event || typeof event !== "object")
+            continue;
+        sawAnyLine = true;
+        if (event.type === "message" &&
+            event.role === "assistant" &&
+            typeof event.content === "string") {
+            assistantChunks.push(event.content);
+            continue;
+        }
+        if (event.type === "result" && event.stats && typeof event.stats === "object") {
+            const stats = event.stats;
+            const input = typeof stats.input_tokens === "number" ? stats.input_tokens : undefined;
+            const output = typeof stats.output_tokens === "number" ? stats.output_tokens : undefined;
+            if (input !== undefined || output !== undefined) {
+                const usage = {
+                    input_tokens: input ?? 0,
+                    output_tokens: output ?? 0,
+                };
+                if (typeof stats.cached === "number") {
+                    usage.cache_read_tokens = stats.cached;
+                }
+                result.usage = usage;
+            }
+        }
+    }
+    if (!sawAnyLine) {
+        return null;
+    }
+    if (assistantChunks.length > 0) {
+        result.response = assistantChunks.join("");
+    }
+    return result;
+}
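
As a reading aid for the parser added above, here is a minimal typed sketch of the same line-by-line logic, run against a small hand-written NDJSON sample. The function name `parseStreamSketch`, the sample event values, and the optional `cache_read_tokens` field on the usage type are assumptions made for illustration; this is a condensed restatement, not the shipped source.

```typescript
// Result shape assumed from the .d.ts hunk and the parser body in the diff.
interface GeminiUsage {
  input_tokens: number;
  output_tokens: number;
  cache_read_tokens?: number;
}
interface GeminiJsonParseResult {
  usage?: GeminiUsage;
  response?: string;
}

// Hypothetical condensed version of parseGeminiStreamJson.
function parseStreamSketch(stdout: string): GeminiJsonParseResult | null {
  const result: GeminiJsonParseResult = {};
  const chunks: string[] = [];
  let sawAnyLine = false;
  for (const line of stdout.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    let event: any;
    try {
      event = JSON.parse(trimmed);
    } catch {
      continue; // non-JSON banner lines are skipped, not treated as errors
    }
    if (!event || typeof event !== "object") continue;
    sawAnyLine = true;
    if (event.type === "message" && event.role === "assistant" && typeof event.content === "string") {
      chunks.push(event.content);
    } else if (event.type === "result" && event.stats && typeof event.stats === "object") {
      const { input_tokens, output_tokens, cached } = event.stats;
      if (typeof input_tokens === "number" || typeof output_tokens === "number") {
        const usage: GeminiUsage = {
          input_tokens: typeof input_tokens === "number" ? input_tokens : 0,
          output_tokens: typeof output_tokens === "number" ? output_tokens : 0,
        };
        if (typeof cached === "number") usage.cache_read_tokens = cached;
        result.usage = usage;
      }
    }
  }
  if (!sawAnyLine) return null;
  if (chunks.length > 0) result.response = chunks.join("");
  return result;
}

// Hand-written sample stream; the first line stands in for CLI banner chatter.
const sample = [
  "Ripgrep not available",
  '{"type":"init","session_id":"s-1","model":"example-model"}',
  '{"type":"message","role":"user","content":"hi"}',
  '{"type":"message","role":"assistant","content":"Hel","delta":true}',
  '{"type":"message","role":"assistant","content":"lo","delta":true}',
  '{"type":"result","status":"success","stats":{"input_tokens":12,"output_tokens":3,"cached":4}}',
].join("\n");

const parsed = parseStreamSketch(sample);
console.log(parsed?.response); // prints "Hello"
console.log(parsed?.usage);
```

Note that the condition on assistant messages checks `type`, `role`, and a string `content`, but not the `delta` field, so any assistant message line contributes to `response` in this logic.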

package/dist/index.d.ts CHANGED

@@ -155,6 +155,8 @@ export declare function prepareClaudeRequest(params: {
     maxTurns?: number;
     effort?: ClaudeEffortLevel;
     excludeDynamicSystemPromptSections?: boolean;
+    fallbackModel?: string;
+    jsonSchema?: string | Record<string, unknown>;
 }, runtime?: GatewayServerRuntime): CliRequestPrep | ExtendedToolResponse;
 export interface CodexRequestPrep extends CliRequestPrep {
     /**
@@ -212,11 +214,13 @@ export declare function prepareGeminiRequest(params: {
     optimizePrompt: boolean;
     operation: string;
     /**
-     * U23: output format. When set to "json", emits `-o json` so Gemini emits
-     * the JSON object containing usageMetadata that `parseGeminiJson` (and
-     * downstream `extractUsageAndCost`) can consume. Defaults to "text".
+     * U23 + Phase 4 slice ε: output format. `json` emits `-o json` (single
+     * JSON object with usageMetadata). `stream-json` emits `-o stream-json`
+     * (NDJSON event stream — `init` / `message` / `result` lines). Both
+     * route through `extractUsageAndCost` so usage tokens reach the flight
+     * recorder. Defaults to "text".
      */
-    outputFormat?: "text" | "json";
+    outputFormat?: "text" | "json" | "stream-json";
     sandbox?: boolean;
     policyFiles?: string[];
     adminPolicyFiles?: string[];
@@ -313,8 +317,11 @@ export interface GeminiRequestParams {
     optimizeResponse?: boolean;
     idleTimeoutMs?: number;
     forceRefresh?: boolean;
-    /** U23: "json" emits `-o json` so token usage is parsed and reported. */
-    outputFormat?: "text" | "json";
+    /**
+     * U23 + Phase 4 slice ε: "json" emits `-o json`; "stream-json" emits
+     * `-o stream-json` (NDJSON event stream). Both are usage-extracted.
+     */
+    outputFormat?: "text" | "json" | "stream-json";
     sandbox?: boolean;
     policyFiles?: string[];
     adminPolicyFiles?: string[];

package/dist/index.js CHANGED

@@ -9,7 +9,7 @@ import { z } from "zod";
 import { executeCli, killAllProcessGroups } from "./executor.js";
 import { parseStreamJson } from "./stream-json-parser.js";
 import { parseCodexJsonStream } from "./codex-json-parser.js";
-import { parseGeminiJson } from "./gemini-json-parser.js";
+import { parseGeminiJson, parseGeminiStreamJson } from "./gemini-json-parser.js";
 import { parseVibeMetaJson } from "./mistral-meta-json-parser.js";
 import { homedir } from "os";
 import { createSessionManager } from "./session-manager.js";
@@ -530,8 +530,8 @@ ctx) {
             costUsd: parsed.usage.cost_usd,
         };
     }
-    if (cli === "gemini" && outputFormat === "json") {
-        const parsed = parseGeminiJson(output);
+    if (cli === "gemini" && (outputFormat === "json" || outputFormat === "stream-json")) {
+        const parsed = outputFormat === "stream-json" ? parseGeminiStreamJson(output) : parseGeminiJson(output);
         if (!parsed || !parsed.usage) {
             return {};
         }
@@ -1005,6 +1005,8 @@ export function prepareClaudeRequest(params, runtime = resolveGatewayServerRunti
         maxTurns: params.maxTurns,
         effort: params.effort,
         excludeDynamicSystemPromptSections: params.excludeDynamicSystemPromptSections,
+        fallbackModel: params.fallbackModel,
+        jsonSchema: params.jsonSchema,
     }));
     return {
         corrId,
@@ -1271,9 +1273,19 @@ export function prepareGeminiRequest(params, runtime = resolveGatewayServerRunti
     // U23 fix: emit `-o json` when the caller asked for JSON output. The Gemini
     // JSON parser is otherwise unreachable from the tool surface and the
     // structured usageMetadata is silently dropped.
+    //
+    // Phase 4 slice ε: same wiring for `-o stream-json` (NDJSON event stream).
+    // Gemini already streams stdout in real-time so the existing 10-minute
+    // idle timeout (CLI_IDLE_TIMEOUTS.gemini) covers both modes without
+    // adjustment — unlike Claude, no `--include-partial-messages` companion
+    // flag is required because Gemini emits assistant `delta` events as part
+    // of the default stream-json shape.
     if (params.outputFormat === "json") {
         args.push("-o", "json");
     }
+    else if (params.outputFormat === "stream-json") {
+        args.push("-o", "stream-json");
+    }
     // Phase 4 slice γ: opt-in trust-prompt bypass for fresh workspaces.
     if (params.skipTrust) {
         args.push("--skip-trust");
@@ -2471,6 +2483,16 @@ export function createGatewayServer(deps = {}) {
             .boolean()
             .optional()
             .describe("Claude --exclude-dynamic-system-prompt-sections: trim dynamic context blocks from the system prompt."),
+        // Phase 4 slice η — Claude reliability + structured-output parity
+        fallbackModel: z
+            .string()
+            .min(1)
+            .optional()
+            .describe("Claude --fallback-model: model name to auto-fallback to when the default model is overloaded (effective only with --print, which the gateway always uses)."),
+        jsonSchema: z
+            .union([z.string(), z.record(z.unknown())])
+            .optional()
+            .describe("Claude --json-schema: JSON Schema literal (NOT a path) constraining structured output. Object values are JSON.stringify-d; string values are passed verbatim. Use with outputFormat='json'."),
         approvalStrategy: z
             .enum(["legacy", "mcp_managed"])
             .default("legacy")
@@ -2501,7 +2523,7 @@ export function createGatewayServer(deps = {}) {
             .boolean()
             .default(false)
             .describe("Bypass dedup and force a fresh CLI run even if a recent identical request exists"),
-    }, async ({ prompt, promptParts, model, outputFormat, sessionId, continueSession, createNewSession, allowedTools, disallowedTools, dangerouslySkipPermissions, permissionMode, agent, agents, forkSession, systemPrompt, appendSystemPrompt, maxBudgetUsd, maxTurns, effort, excludeDynamicSystemPromptSections, approvalStrategy, approvalPolicy, mcpServers, strictMcpConfig, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, }) => {
+    }, async ({ prompt, promptParts, model, outputFormat, sessionId, continueSession, createNewSession, allowedTools, disallowedTools, dangerouslySkipPermissions, permissionMode, agent, agents, forkSession, systemPrompt, appendSystemPrompt, maxBudgetUsd, maxTurns, effort, excludeDynamicSystemPromptSections, fallbackModel, jsonSchema, approvalStrategy, approvalPolicy, mcpServers, strictMcpConfig, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, }) => {
         const startTime = Date.now();
         if (systemPrompt !== undefined && appendSystemPrompt !== undefined) {
             return createErrorResponse("claude", 1, "", correlationId, new Error("systemPrompt and appendSystemPrompt are mutually exclusive; use one or the other (not both)."));
@@ -2531,6 +2553,8 @@ export function createGatewayServer(deps = {}) {
             maxTurns,
             effort,
             excludeDynamicSystemPromptSections,
+            fallbackModel,
+            jsonSchema,
         }, runtime);
         if (!("args" in prep))
             return prep;
@@ -3069,11 +3093,14 @@ export function createGatewayServer(deps = {}) {
             .default(false)
             .describe("Bypass dedup and force a fresh CLI run even if a recent identical request exists"),
         // U23: emit `-o json` to extract token usage via parseGeminiJson. Default
-        // remains text so existing callers see no behavior change.
+        // remains text so existing callers see no behavior change. Phase 4 slice
+        // ε adds `stream-json` (NDJSON event stream parsed by
+        // parseGeminiStreamJson — `init`/`message`/`result` lines, idle-timeout
+        // semantics covered by Gemini's existing real-time stdout streaming).
         outputFormat: z
-            .enum(["text", "json"])
+            .enum(["text", "json", "stream-json"])
             .default("text")
-            .describe("Gemini output format. `json` emits `-o json` so usageMetadata is parsed and reported."),
+            .describe("Gemini output format. `json` emits `-o json` (single JSON with usageMetadata). `stream-json` emits `-o stream-json` (NDJSON event stream — `init`/`message`/`result` lines, usage extracted from the terminal `result.stats` event). Both report usage to the flight recorder."),
         sandbox: GEMINI_HIGH_IMPACT_PARAMS_SCHEMA.shape.sandbox.describe("Run Gemini in sandbox mode (-s)"),
         policyFiles: GEMINI_HIGH_IMPACT_PARAMS_SCHEMA.shape.policyFiles.describe("Policy file paths (--policy <path>, one per file). Paths must exist."),
         adminPolicyFiles: GEMINI_HIGH_IMPACT_PARAMS_SCHEMA.shape.adminPolicyFiles.describe("Admin policy file paths (--admin-policy <path>, one per file). Paths must exist."),
@@ -3395,6 +3422,16 @@ export function createGatewayServer(deps = {}) {
                 .boolean()
                 .optional()
                 .describe("Claude --exclude-dynamic-system-prompt-sections: trim dynamic context blocks from the system prompt."),
+            // Phase 4 slice η — Claude reliability + structured-output parity
+            fallbackModel: z
+                .string()
+                .min(1)
+                .optional()
+                .describe("Claude --fallback-model: model name to auto-fallback to when the default model is overloaded (effective only with --print, which the gateway always uses)."),
+            jsonSchema: z
+                .union([z.string(), z.record(z.unknown())])
+                .optional()
+                .describe("Claude --json-schema: JSON Schema literal (NOT a path) constraining structured output. Object values are JSON.stringify-d; string values are passed verbatim. Use with outputFormat='json'."),
             approvalStrategy: z
                 .enum(["legacy", "mcp_managed"])
                 .default("legacy")
@@ -3424,7 +3461,7 @@ export function createGatewayServer(deps = {}) {
                 .boolean()
                 .default(false)
                 .describe("Bypass dedup and force a fresh CLI run even if a recent identical request exists"),
-        }, async ({ prompt, promptParts, model, outputFormat, sessionId, continueSession, createNewSession, allowedTools, disallowedTools, dangerouslySkipPermissions, permissionMode, agent, agents, forkSession, systemPrompt, appendSystemPrompt, maxBudgetUsd, maxTurns, effort, excludeDynamicSystemPromptSections, approvalStrategy, approvalPolicy, mcpServers, strictMcpConfig, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, }) => {
+        }, async ({ prompt, promptParts, model, outputFormat, sessionId, continueSession, createNewSession, allowedTools, disallowedTools, dangerouslySkipPermissions, permissionMode, agent, agents, forkSession, systemPrompt, appendSystemPrompt, maxBudgetUsd, maxTurns, effort, excludeDynamicSystemPromptSections, fallbackModel, jsonSchema, approvalStrategy, approvalPolicy, mcpServers, strictMcpConfig, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, }) => {
             if (systemPrompt !== undefined && appendSystemPrompt !== undefined) {
                 return createErrorResponse("claude", 1, "", correlationId, new Error("systemPrompt and appendSystemPrompt are mutually exclusive; use one or the other (not both)."));
             }
@@ -3453,6 +3490,8 @@ export function createGatewayServer(deps = {}) {
                 maxTurns,
                 effort,
                 excludeDynamicSystemPromptSections,
+                fallbackModel,
+                jsonSchema,
             }, runtime);
             if (!("args" in prep))
                 return prep;
@@ -3691,11 +3730,14 @@ export function createGatewayServer(deps = {}) {
                 .default(false)
                 .describe("Bypass dedup and force a fresh CLI run even if a recent identical request exists"),
             // U23: emit `-o json` to extract token usage via parseGeminiJson. Default
-            // remains text so existing callers see no behavior change.
+            // remains text so existing callers see no behavior change. Phase 4 slice
+            // ε adds `stream-json` (NDJSON event stream parsed by
+            // parseGeminiStreamJson — `init`/`message`/`result` lines, idle-timeout
+            // semantics covered by Gemini's existing real-time stdout streaming).
             outputFormat: z
-                .enum(["text", "json"])
+                .enum(["text", "json", "stream-json"])
                 .default("text")
-                .describe("Gemini output format. `json` emits `-o json` so usageMetadata is parsed and reported."),
+                .describe("Gemini output format. `json` emits `-o json` (single JSON with usageMetadata). `stream-json` emits `-o stream-json` (NDJSON event stream — `init`/`message`/`result` lines, usage extracted from the terminal `result.stats` event). Both report usage to the flight recorder."),
             sandbox: GEMINI_HIGH_IMPACT_PARAMS_SCHEMA.shape.sandbox.describe("Run Gemini in sandbox mode (-s)"),
             policyFiles: GEMINI_HIGH_IMPACT_PARAMS_SCHEMA.shape.policyFiles.describe("Policy file paths (--policy <path>, one per file). Paths must exist."),
             adminPolicyFiles: GEMINI_HIGH_IMPACT_PARAMS_SCHEMA.shape.adminPolicyFiles.describe("Admin policy file paths (--admin-policy <path>, one per file). Paths must exist."),
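
The Gemini changes in this file come down to two decisions keyed on `outputFormat`: which argv fragment `prepareGeminiRequest` emits, and which parser `extractUsageAndCost` picks. A small sketch of that mapping follows; the helper names `geminiOutputArgs` and `geminiParserFor` are hypothetical and exist only to summarize the branches in the hunks above.

```typescript
type GeminiOutputFormat = "text" | "json" | "stream-json";

// Hypothetical helper: argv fragment per format, mirroring the
// if / else-if added to prepareGeminiRequest ("text" emits nothing).
function geminiOutputArgs(format: GeminiOutputFormat = "text"): string[] {
  return format === "text" ? [] : ["-o", format];
}

// Hypothetical helper: name of the parser the widened branch in
// extractUsageAndCost selects; null means no usage extraction for text.
function geminiParserFor(
  format: GeminiOutputFormat
): "parseGeminiJson" | "parseGeminiStreamJson" | null {
  if (format === "stream-json") return "parseGeminiStreamJson";
  if (format === "json") return "parseGeminiJson";
  return null;
}

const formats: GeminiOutputFormat[] = ["text", "json", "stream-json"];
for (const format of formats) {
  console.log(format, geminiOutputArgs(format), geminiParserFor(format));
}
```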

package/dist/request-helpers.d.ts CHANGED

@@ -350,6 +350,20 @@ export interface ClaudeHighImpactFlagsInput {
     maxTurns?: number;
     effort?: ClaudeEffortLevel;
     excludeDynamicSystemPromptSections?: boolean;
+    /**
+     * Phase 4 slice η — Claude `--fallback-model <model>`. Routes overloaded-model
+     * requests to the named fallback. Only effective with `--print` (we always pass
+     * `-p`, so no extra gating required here).
+     */
+    fallbackModel?: string;
+    /**
+     * Phase 4 slice η — Claude `--json-schema <schema>`. Per `claude --help`, the
+     * argument is the JSON Schema *literal*, not a path. Object values are
+     * `JSON.stringify`-d; string values are passed verbatim (caller already wrote
+     * a JSON literal). No temp file lifecycle needed (contrast with Codex
+     * `--output-schema`, which takes a path).
+     */
+    jsonSchema?: string | Record<string, unknown>;
 }
 /**
  * Emit Claude high-impact feature flags (U25) as a flat argv segment.

package/dist/request-helpers.js CHANGED

@@ -438,6 +438,13 @@ export function prepareClaudeHighImpactFlags(input) {
     if (input.excludeDynamicSystemPromptSections) {
         args.push("--exclude-dynamic-system-prompt-sections");
     }
+    if (input.fallbackModel !== undefined) {
+        args.push("--fallback-model", input.fallbackModel);
+    }
+    if (input.jsonSchema !== undefined) {
+        const schemaArg = typeof input.jsonSchema === "string" ? input.jsonSchema : JSON.stringify(input.jsonSchema);
+        args.push("--json-schema", schemaArg);
+    }
     return args;
 }
 //──────────────────────────────────────────────────────────────────────────────
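
To make the string-versus-object handling of `jsonSchema` concrete, the sketch below reproduces only the two branches added in this hunk and shows that an object schema and its pre-serialized string form yield the same `--json-schema` argument. The names `SchemaFlagsInput` and `sketchClaudeFlags` are hypothetical; the model name and schema are taken from the conformance fixtures in this diff.

```typescript
// Hypothetical stand-in for the two new fields of ClaudeHighImpactFlagsInput.
interface SchemaFlagsInput {
  fallbackModel?: string;
  jsonSchema?: string | Record<string, unknown>;
}

// Mirrors the two branches added to prepareClaudeHighImpactFlags.
function sketchClaudeFlags(input: SchemaFlagsInput): string[] {
  const args: string[] = [];
  if (input.fallbackModel !== undefined) {
    args.push("--fallback-model", input.fallbackModel);
  }
  if (input.jsonSchema !== undefined) {
    // Objects are serialized; strings are assumed to already be a JSON literal.
    const schemaArg =
      typeof input.jsonSchema === "string" ? input.jsonSchema : JSON.stringify(input.jsonSchema);
    args.push("--json-schema", schemaArg);
  }
  return args;
}

const schema = {
  type: "object",
  properties: { name: { type: "string" } },
  required: ["name"],
};

const fromObject = sketchClaudeFlags({
  fallbackModel: "claude-haiku-4-5-20251001",
  jsonSchema: schema,
});
const fromString = sketchClaudeFlags({ jsonSchema: JSON.stringify(schema) });

console.log(fromObject);
console.log(fromString);
```

Because the check is `!== undefined` and the string branch of the zod union has no `min(1)`, an empty-string `jsonSchema` would still produce a `--json-schema ""` pair in this logic. Callers may want to guard against that themselves.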

package/dist/upstream-contracts.js CHANGED

@@ -37,6 +37,8 @@ export const UPSTREAM_CLI_CONTRACTS = {
             "maxTurns",
             "effort",
             "excludeDynamicSystemPromptSections",
+            "fallbackModel",
+            "jsonSchema",
             "approvalStrategy",
             "mcpServers",
             "strictMcpConfig",
@@ -78,6 +80,14 @@ export const UPSTREAM_CLI_CONTRACTS = {
                 arity: "none",
                 description: "Trim dynamic system prompt sections",
             },
+            "--fallback-model": {
+                arity: "one",
+                description: "Auto-fallback model when default is overloaded (Claude --print only)",
+            },
+            "--json-schema": {
+                arity: "one",
+                description: "JSON Schema literal constraining structured output",
+            },
             "--continue": { arity: "none", description: "Continue active session" },
             "--session-id": { arity: "one", description: "Session id" },
         },
@@ -95,6 +105,29 @@ export const UPSTREAM_CLI_CONTRACTS = {
                 args: ["-p", "hello", "--not-a-claude-flag"],
                 expect: "fail",
             },
+            {
+                // Phase 4 slice η: --fallback-model wired through prepareClaudeRequest.
+                id: "claude-fallback-model",
+                description: "Phase 4 slice η: --fallback-model accepted",
+                args: ["-p", "hello", "--fallback-model", "claude-haiku-4-5-20251001"],
+                expect: "pass",
+            },
+            {
+                // Phase 4 slice η: --json-schema accepts an inline JSON Schema literal
+                // (per `claude --help` example), not a path. Codex parity for
+                // structured-output validation in one slice.
+                id: "claude-json-schema",
+                description: "Phase 4 slice η: --json-schema accepts inline JSON literal",
+                args: [
+                    "-p",
+                    "hello",
+                    "--output-format",
+                    "json",
+                    "--json-schema",
+                    '{"type":"object","properties":{"name":{"type":"string"}},"required":["name"]}',
+                ],
+                expect: "pass",
+            },
         ],
     },
     codex: {
@@ -248,7 +281,11 @@ export const UPSTREAM_CLI_CONTRACTS = {
             "-s": { arity: "none", description: "Sandbox mode" },
             "--policy": { arity: "one", description: "Policy file path" },
             "--admin-policy": { arity: "one", description: "Admin policy file path" },
-            "-o": { arity: "one", values: ["json"], description: "Output format" },
+            "-o": {
+                arity: "one",
+                values: ["json", "stream-json"],
+                description: "Output format (Phase 4 slice ε adds stream-json)",
+            },
             "--resume": { arity: "one", description: "Resume session" },
             "--skip-trust": {
                 arity: "none",
@@ -275,6 +312,18 @@ export const UPSTREAM_CLI_CONTRACTS = {
                 args: ["-p", "hello", "--skip-trust"],
                 expect: "pass",
             },
+            {
+                id: "gemini-stream-json",
+                description: "Phase 4 slice ε: -o stream-json is accepted",
+                args: ["-p", "hello", "-o", "stream-json"],
+                expect: "pass",
+            },
+            {
+                id: "gemini-output-format-invalid",
+                description: "Phase 4 slice ε: -o ndjson is rejected (not in contract enum)",
+                args: ["-p", "hello", "-o", "ndjson"],
+                expect: "fail",
+            },
         ],
     },
     grok: {
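
The fixtures above only pin behaviour once their `args` are run through a validator. The real `validateUpstreamCliArgs` is not part of this diff, so the following is a hypothetical minimal validator, `validateArgsSketch`, that checks arity and enum values against a small flag table. The `-p` entry with arity `"one"` is an assumption inferred from the fixture args, since its contract entry is not shown here.

```typescript
// Hypothetical contract entry shape, modeled on the flag entries in the diff.
interface FlagSpec {
  arity: "none" | "one";
  values?: string[];
}

// Subset of the Gemini flag table. "-p" is an assumption (see lead-in).
const geminiFlags = new Map<string, FlagSpec>([
  ["-p", { arity: "one" }],
  ["-o", { arity: "one", values: ["json", "stream-json"] }],
  ["--skip-trust", { arity: "none" }],
]);

// Hypothetical validator: every token must be a known flag or the value
// consumed by an arity-"one" flag; enum-constrained values must match.
function validateArgsSketch(flags: Map<string, FlagSpec>, args: string[]): boolean {
  for (let i = 0; i < args.length; i++) {
    const spec = flags.get(args[i]);
    if (!spec) return false; // unknown flag or stray token
    if (spec.arity === "one") {
      const value = args[++i];
      if (value === undefined) return false; // flag is missing its value
      if (spec.values && !spec.values.includes(value)) return false;
    }
  }
  return true;
}

// The two new Gemini fixtures from the diff, with their expected outcomes.
const fixtures: { id: string; args: string[]; expect: "pass" | "fail" }[] = [
  { id: "gemini-stream-json", args: ["-p", "hello", "-o", "stream-json"], expect: "pass" },
  { id: "gemini-output-format-invalid", args: ["-p", "hello", "-o", "ndjson"], expect: "fail" },
];

for (const fixture of fixtures) {
  const outcome = validateArgsSketch(geminiFlags, fixture.args) ? "pass" : "fail";
  console.log(fixture.id, outcome === fixture.expect ? "ok" : "MISMATCH");
}
```

Pairing a passing fixture with a failing one, as the diff does for `-o`, is what bounds the enum: widening `values` without the `ndjson` fixture would leave an over-permissive table undetected.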

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "llm-cli-gateway",
-  "version": "1.9.0",
+  "version": "1.11.0",
   "mcpName": "io.github.verivus-oss/llm-cli-gateway",
   "description": "MCP server providing unified access to Claude Code, Codex, Gemini, Grok, and Mistral Vibe CLIs with session management, retry logic, async job orchestration, durable job results, and cross-LLM validation.",
   "license": "MIT",