llm-cli-gateway (npm) 1.7.0 → 1.9.0

Files changed (9)

package/CHANGELOG.md +170 -0
package/dist/index.d.ts +94 -0
package/dist/index.js +139 -35
package/dist/mistral-meta-json-parser.d.ts +6 -0
package/dist/mistral-meta-json-parser.js +175 -0
package/dist/request-helpers.d.ts +25 -5
package/dist/request-helpers.js +14 -5
package/dist/upstream-contracts.js +94 -9
package/package.json +1 -1

package/CHANGELOG.md CHANGED

@@ -2,6 +2,176 @@
 All notable changes to the llm-cli-gateway project.
+## [1.9.0] - 2026-05-27 — Phase 4 slice δ (budget/max-turns parity) + retroactive α/γ contract closure
+Ships the fourth Phase 4 slice (budget/max-turns parity for Grok and Mistral),
+and retroactively closes three latent contract gaps that shipped silently in
+v1.8.0 (slices α and γ). Five commits land together: the slice δ feature,
+two bounds-tightening fixes, a contract-table closure, and a test-veracity
+hardening pass driven by an iterative multi-LLM audit.
+### Added — `maxTurns` / `maxPrice` budget caps (slice δ)
+- `grok_request` and `grok_request_async` gain optional `maxTurns?: number`
+  → emits `grok --max-turns N`. Grok exposes no per-request budget flag,
+  so `--max-price` is Mistral-only.
+- `mistral_request` and `mistral_request_async` gain optional
+  `maxTurns?: number` → `vibe --max-turns N` AND `maxPrice?: number` →
+  `vibe --max-price DOLLARS`. Both apply only in programmatic mode (`-p`),
+  matching Vibe's documented constraint.
+- The Mistral stale-model recovery retry path (extracted into a pure
+  `buildMistralRetryPrep` helper) preserves all three slice-γ/δ flags
+  (`trust`, `maxTurns`, `maxPrice`) on the second attempt.
+- Defaults: undefined for all three new fields → no flag emitted →
+  existing callers see no behavioural change.
+### Fixed — Bounded numeric schemas for lossless argv stringification
+- Extracted two shared, exported Zod constants:
+  - `MAX_TURNS_SCHEMA = z.number().int().positive().safe().max(10_000)`
+  - `MAX_PRICE_SCHEMA = z.number().positive().finite().min(1e-6).max(10_000)`
+- The `.min(1e-6)` lower bound on price sits exactly at the boundary where
+  `String(N)` switches from decimal to scientific notation
+  (`String(1e-6) === "0.000001"` but `String(1e-7) === "1e-7"`); both
+  upstream CLIs reject scientific-notation values.
+- Reused across all four slice-δ tool registrations so bounds stay
+  consistent if they ever need to change.
+### Fixed — Upstream contract table closes 5 latent flag gaps
+`assertUpstreamCliArgs` consults `UPSTREAM_CLI_CONTRACTS` on every real
+`*_request` call. The following flags / mcpParameters were never registered
+there before this release, so production calls setting any of them threw
+"Upstream contract violation" at runtime even though the prepare-function
+unit tests passed:
+- **Gemini** (slice γ retroactive): `skipTrust` + `--skip-trust`.
+- **Mistral** (slice γ + δ retroactive): `trust` + `--trust`; `maxTurns` +
+  `--max-turns`; `maxPrice` + `--max-price` (with a strict decimal-only
+  regex matching `MAX_PRICE_SCHEMA`'s lower bound).
+- **Grok** (slice δ): `maxTurns` + `--max-turns`.
+- **Codex** (slice α retroactive): `--output-schema` and `-c` removed
+  from `resumeForbiddenFlags` — verified accepted on `codex exec resume`
+  per codex-cli 0.133.0.
+Conformance fixtures pin each new flag's argv shape, including a
+`mistral-max-price-scientific-notation` fixture that locks the `1e-7`
+rejection at the contract layer.
+### Hardened — Test veracity (multi-LLM audit follow-up)
+Codex + Grok ran iterative test-veracity audits with mutation probes per
+`docs/plans/test-veracity-audit.spec.md`. They proved several added tests
+were not falsifiable on the dimensions their commit messages claimed.
+New file `src/__tests__/test-veracity-regressions.test.ts` closes those
+gaps with six describe blocks:
+- **REGRESSIONS A** — probes registered tool `inputSchema` bounds
+  directly (not the bare schema constants), so schema-drift in any of
+  the four sync/async registrations is caught.
+- **REGRESSIONS B** — tests the pure `buildMistralRetryPrep` helper
+  across all combinations of `trust × maxTurns × maxPrice`. Self-
+  validated: dropping any of the three forwards on retry goes red.
+- **REGRESSIONS C** — positive allowlist asserting slice α/γ/δ
+  parameters live in the matching contract's `mcpParameters` (closes
+  the self-oracle gap where removing a param from BOTH the contract
+  AND the schema previously stayed green).
+- **REGRESSIONS D** — threads `prepare*Request` output into
+  `validateUpstreamCliArgs` end-to-end; the exact consistency check
+  the latent v1.8.0 contract breaks would have failed.
+- **REGRESSIONS E** — `it.each` over sync AND async variants of every
+  slice-touched tool; the existing C4 was sync-only.
+- **REGRESSIONS F** — flag-fixture coverage map: every flag in each
+  contract `flags` table must be exercised by a passing fixture (with
+  a grandfathered pre-audit baseline). Forces future slice authors to
+  add a fixture alongside any new flag entry.
+The existing C4 (`MCP request schemas expose the provider contract
+parameters`) now walks `_async` tools too.
+### Notes
+Multi-LLM review across multiple iterative rounds, ending with a
+dedicated test-veracity audit per Werner's strict-evidence protocol
+(documented in `docs/plans/test-veracity-audit.spec.md`). Round 2 of the
+audit landed UNCONDITIONAL APPROVE from Codex, Grok, Claude, and Mistral
+with full mutation-probe evidence — every documented counterexample
+mutation went red as predicted; tests are falsifiable by exactly the
+regressions they claim to guard against. Gemini was quota-exhausted
+during the audit window (~6h reset) and did not participate in round 2.
+## [1.8.0] - 2026-05-27 — Phase 4 openers (codex resume fix, mistral telemetry, headless trust flags)
+Ships the first three slices of the Phase 4 provider-modernisation
+backlog, one bug fix and two small features. Multi-LLM review surfaced
+five additional bug classes during the cycle (path traversal, UUID→dir
+resolution gap, sync usage ctx drop, retry-path flag drop, symlink
+boundary bypass); all are addressed in the two follow-up fix commits.
+### Fixed — Codex `--output-schema` + `-c/--config` on `exec resume`
+- `prepareCodexRequest` previously dropped `outputSchema` and
+  `configOverrides` on the resume branch because the U26 audit assumed
+  `codex exec resume` rejected both flags. Live re-verification against
+  `codex exec resume --help` (codex-cli 0.133.0) confirms both ARE
+  accepted on resume; only `--search` remains resume-incompatible. The
+  resume branch now threads both fields through, reusing the existing
+  outputSchema temp-file materialisation + cleanup contract.
+  `CODEX_RESUME_FILTERED_FLAGS` no longer strips `--output-schema`.
+### Added — Mistral Vibe `meta.json` usage / cost telemetry
+- New `src/mistral-meta-json-parser.ts` reads
+  `~/.vibe/logs/session/session_<YYYYMMDD>_<HHMMSS>_<first8hex>/meta.json`
+  (the actual filename — an earlier TODO at `src/index.ts:750` said
+  `metadata.json`, which was incorrect). Maps `stats.session_prompt_tokens`,
+  `stats.session_completion_tokens`, and `stats.session_cost` onto the
+  gateway's `inputTokens`/`outputTokens`/`costUsd` flight-recorder
+  columns. Cache-token surfaces stay undefined — Vibe doesn't expose
+  them today.
+- The gateway's mistral sessionId surface accepts the full UUID (to match
+  `vibe --resume <uuid>`), but Vibe persists telemetry under
+  `session_<ts>_<first8>` directories. The new resolver globs by the
+  leading 8-hex prefix and verifies each candidate's `session_id` field
+  before returning — required for every UUID input including
+  single-match cases, so two UUIDs sharing the leading 8 hex chars never
+  cross-attribute usage.
+- `extractUsageAndCost` and `buildAsyncFlightRecorderHandoff` thread a
+  primitives-only `{ sessionId, home }` context so the AsyncJobRecord
+  retention stays O(constant). `buildCliResponse` passes the same ctx so
+  sync `mistral_request` resume calls populate structured usage in their
+  response (not just the flight-recorder row).
+### Added — Headless trust-prompt bypass for Gemini + Mistral
+- New optional `skipTrust?: boolean` field on `gemini_request` and
+  `gemini_request_async`, defaulting `false`. When set, emits
+  `--skip-trust` so fresh workspaces don't block headless invocations on
+  Gemini's interactive trust prompt.
+- New optional `trust?: boolean` field on `mistral_request` and
+  `mistral_request_async`, defaulting `false`. When set, emits `--trust`
+  (per-invocation only, not persisted to `trusted_folders.toml`) so
+  fresh workspaces don't block headless Vibe runs. Preserved on the
+  stale-model recovery retry path so a fresh untrusted workspace can't
+  deadlock on the second attempt.
+- Default `false` preserves existing prompt behaviour for legacy
+  callers.
+### Security
+- `parseVibeMetaJson` enforces a strict input charset (UUID-shape OR
+  `^session_\d{8}_\d{6}_[0-9a-f]{8}$` Vibe dir basename) before any
+  filesystem access.
+- New `readInBase(realBase, candidate)` helper realpath-resolves both
+  ends and rejects targets whose final inode lives outside the session
+  log root. Both the resolver's disambiguation reads and the final
+  parser read route through it, so an in-tree symlink to an
+  out-of-tree directory (or symlinked meta.json) cannot leak file
+  contents outside `~/.vibe/logs/session/`.
+- Test coverage: traversal inputs (`../`, absolute, control-char,
+  embedded `../`), single-candidate prefix-collision rejection,
+  symlink-to-outside-baseDir rejection.
 ## [1.7.0] - 2026-05-26 — cache-awareness slice 1.5 (async-path flight recorder + codex parser fix)
 Closes the two telemetry gaps that v1.6.0 explicitly deferred: async-path
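
Reviewer sketch for the v1.8.0 Security notes above: a minimal illustration of a charset gate followed by a realpath containment check. The names `isAcceptableSessionRef` and `readInsideBase` are hypothetical and this is not the package's `readInBase`; the UUID regex is an assumption, since the changelog only says "UUID-shape". The demo assumes a POSIX-like filesystem where creating symlinks is permitted.

```typescript
import { mkdirSync, mkdtempSync, readFileSync, realpathSync, symlinkSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join, sep } from "node:path";

// Assumed UUID shape; the changelog does not spell out the exact pattern.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
// Vibe session directory basename pattern, as quoted in the changelog.
const VIBE_DIR_RE = /^session_\d{8}_\d{6}_[0-9a-f]{8}$/;

// Gate applied before any filesystem access: rejects "../", absolute paths, etc.
function isAcceptableSessionRef(input: string): boolean {
  return UUID_RE.test(input) || VIBE_DIR_RE.test(input);
}

// Hypothetical helper: resolve both ends with realpath and refuse any target
// whose resolved location is not strictly inside the resolved base directory.
function readInsideBase(baseDir: string, candidate: string): string | undefined {
  try {
    const resolvedBase = realpathSync(baseDir);
    const resolvedTarget = realpathSync(candidate);
    if (!resolvedTarget.startsWith(resolvedBase + sep)) return undefined;
    return readFileSync(resolvedTarget, "utf8");
  } catch {
    return undefined; // missing file, broken link, or permission error
  }
}

// Demo layout: one genuine session dir, plus an in-tree symlink that points
// at a directory outside the base.
const root = mkdtempSync(join(tmpdir(), "vibe-demo-"));
const base = join(root, "session");
const outside = join(root, "outside");
const realDir = join(base, "session_20260527_120000_deadbeef");
mkdirSync(realDir, { recursive: true });
mkdirSync(outside);
writeFileSync(join(realDir, "meta.json"), '{"ok":true}');
writeFileSync(join(outside, "secret.json"), '{"secret":true}');
symlinkSync(outside, join(base, "session_20260527_120001_cafebabe"));

const inTree = readInsideBase(base, join(realDir, "meta.json"));
const viaSymlink = readInsideBase(
  base,
  join(base, "session_20260527_120001_cafebabe", "secret.json"),
);
console.log(inTree, viaSymlink);
```

Comparing against `resolvedBase + sep` rather than the bare base path avoids accepting a sibling directory whose name merely starts with the base name.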

package/dist/index.d.ts CHANGED

@@ -54,6 +54,19 @@ declare const logger: {
     debug: (message: string, ...args: any[]) => void;
 };
 type GatewayLogger = typeof logger;
+/**
+ * Phase 4 slice δ — shared Zod fragments for `maxTurns` / `maxPrice`.
+ *
+ * Both flags reach the upstream CLIs as decimal-formatted argv strings via
+ * `String(N)`. `z.number().int().positive()` alone lets values past
+ * `Number.MAX_SAFE_INTEGER` through, after which `String(1e21)` emits
+ * scientific notation that Grok and Vibe both reject. The bounds below
+ * (safe-integer cap + 10000 ceiling for turns; finite + 10000 USD ceiling
+ * for price) guarantee a lossless decimal stringification AND a sane
+ * upper bound — no plausible single agent loop exceeds 10k turns or 10k USD.
+ */
+export declare const MAX_TURNS_SCHEMA: z.ZodNumber;
+export declare const MAX_PRICE_SCHEMA: z.ZodNumber;
 export declare const SESSION_PROVIDER_VALUES: readonly ["claude", "codex", "gemini", "grok", "mistral"];
 export declare const SESSION_PROVIDER_ENUM: z.ZodEnum<["claude", "codex", "gemini", "grok", "mistral"]>;
 export type SessionProvider = (typeof SESSION_PROVIDER_VALUES)[number];
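
Reviewer sketch for the bounds described in the doc comment above: a dependency-free approximation of what `MAX_TURNS_SCHEMA` and `MAX_PRICE_SCHEMA` are meant to guarantee. The predicates `isValidMaxTurns`, `isValidMaxPrice`, and `stringifiesAsDecimal` are hypothetical stand-ins, not the package's Zod schemas.

```typescript
// Shared ceiling and price floor, mirroring the values quoted in the changelog.
const CEILING = 10_000;
const MIN_PRICE = 1e-6;

// Approximates z.number().int().positive().safe().max(10_000).
function isValidMaxTurns(n: number): boolean {
  return Number.isSafeInteger(n) && n > 0 && n <= CEILING;
}

// Approximates z.number().positive().finite().min(1e-6).max(10_000).
function isValidMaxPrice(n: number): boolean {
  return Number.isFinite(n) && n >= MIN_PRICE && n <= CEILING;
}

// True when String(n) is plain decimal: digits with an optional fraction,
// and no exponent marker.
function stringifiesAsDecimal(n: number): boolean {
  return /^\d+(\.\d+)?$/.test(String(n));
}

console.log(String(1e-6)); // "0.000001"
console.log(String(1e-7)); // "1e-7"
console.log(String(1e21)); // "1e+21"
```

Every value the two predicates accept also passes `stringifiesAsDecimal`, which is the property the changelog calls lossless argv stringification.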
@@ -81,6 +94,23 @@ interface GatewayServerRuntime {
     persistence: PersistenceConfig;
     cacheAwareness: CacheAwarenessConfig;
 }
+export declare function extractUsageAndCost(cli: "claude" | "codex" | "gemini" | "grok" | "mistral", output: string, outputFormat?: string,
+/**
+ * Optional context for off-stdout telemetry sources. Today only Mistral
+ * uses this — its meta.json lives on disk keyed by sessionId. Threading
+ * this in keeps the closure built by `buildAsyncFlightRecorderHandoff`
+ * primitives-only (no `params`/`prep` retention on AsyncJobRecord).
+ */
+ctx?: {
+    sessionId?: string;
+    home?: string;
+}): {
+    inputTokens?: number;
+    outputTokens?: number;
+    cacheReadTokens?: number;
+    cacheCreationTokens?: number;
+    costUsd?: number;
+};
 interface CliRequestPrep {
     corrId: string;
     effectivePrompt: string;
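
Reviewer sketch for the `ctx`-driven Mistral branch declared above: one way the `meta.json` fields named in the changelog (`stats.session_prompt_tokens`, `stats.session_completion_tokens`, `stats.session_cost`) could map onto `inputTokens` / `outputTokens` / `costUsd`. The function `usageFromMetaJson` is hypothetical, and placing `session_id` at the top level of the JSON is an assumption; the real parser in `mistral-meta-json-parser.js` may differ.

```typescript
interface SketchUsage {
  inputTokens?: number;
  outputTokens?: number;
  costUsd?: number;
}

// Returns {} on any malformed input or session mismatch, mirroring the
// best-effort behaviour described in the diff comments.
function usageFromMetaJson(raw: string, expectedSessionId: string): SketchUsage {
  let meta: unknown;
  try {
    meta = JSON.parse(raw);
  } catch {
    return {};
  }
  if (typeof meta !== "object" || meta === null) return {};
  const record = meta as { session_id?: unknown; stats?: unknown };

  // Assumed location of the id; guards against two UUIDs sharing a prefix.
  if (record.session_id !== expectedSessionId) return {};

  const stats =
    typeof record.stats === "object" && record.stats !== null
      ? (record.stats as Record<string, unknown>)
      : {};
  const asNumber = (v: unknown): number | undefined =>
    typeof v === "number" && Number.isFinite(v) ? v : undefined;

  const usage: SketchUsage = {};
  const input = asNumber(stats["session_prompt_tokens"]);
  const output = asNumber(stats["session_completion_tokens"]);
  const cost = asNumber(stats["session_cost"]);
  if (input !== undefined) usage.inputTokens = input;
  if (output !== undefined) usage.outputTokens = output;
  if (cost !== undefined) usage.costUsd = cost;
  return usage; // cache-token fields stay absent, as in the changelog
}

const sampleMeta = JSON.stringify({
  session_id: "deadbeef-0000-4000-8000-000000000000",
  stats: { session_prompt_tokens: 120, session_completion_tokens: 45, session_cost: 0.0031 },
});
console.log(usageFromMetaJson(sampleMeta, "deadbeef-0000-4000-8000-000000000000"));
```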
@@ -191,6 +221,35 @@ export declare function prepareGeminiRequest(params: {
     policyFiles?: string[];
     adminPolicyFiles?: string[];
     attachments?: string[];
+    /**
+     * Phase 4 slice γ: emit `--skip-trust` so first-run workspaces don't
+     * block headless invocations on the interactive trust prompt. Default
+     * is undefined (preserves current prompt behaviour for legacy callers).
+     */
+    skipTrust?: boolean;
+}, runtime?: GatewayServerRuntime): CliRequestPrep | ExtendedToolResponse;
+export declare function prepareGrokRequest(params: {
+    prompt?: string;
+    promptParts?: PromptParts;
+    model?: string;
+    outputFormat?: string;
+    alwaysApprove?: boolean;
+    permissionMode?: string;
+    effort?: string;
+    reasoningEffort?: string;
+    allowedTools?: string[];
+    disallowedTools?: string[];
+    approvalStrategy: "legacy" | "mcp_managed";
+    approvalPolicy?: string;
+    mcpServers?: ClaudeMcpServerName[];
+    correlationId?: string;
+    optimizePrompt: boolean;
+    operation: string;
+    /**
+     * Phase 4 slice δ: emit `--max-turns N` so callers can cap agent-loop
+     * iterations for cost / latency control. Mirrors Claude's wiring.
+     */
+    maxTurns?: number;
 }, runtime?: GatewayServerRuntime): CliRequestPrep | ExtendedToolResponse;
 export declare function prepareMistralRequest(params: {
     prompt?: string;
@@ -208,9 +267,34 @@ export declare function prepareMistralRequest(params: {
     correlationId?: string;
     optimizePrompt: boolean;
     operation: string;
+    /**
+     * Phase 4 slice γ: emit `--trust` to bypass Vibe's interactive trust
+     * prompt for this invocation only (not persisted). Default undefined.
+     */
+    trust?: boolean;
+    /** Phase 4 slice δ: Vibe `--max-turns N` cap on agent-loop iterations. */
+    maxTurns?: number;
+    /** Phase 4 slice δ: Vibe `--max-price DOLLARS` cumulative-cost cap. */
+    maxPrice?: number;
 }, runtime?: GatewayServerRuntime): (CliRequestPrep & {
     mistralEnv: Record<string, string>;
 }) | ExtendedToolResponse;
+/**
+ * Phase 4 slice δ post-review: pure helper extracted from
+ * `handleMistralRequest` so the retry-path arg-preservation invariants
+ * (trust + maxTurns + maxPrice from slices γ/δ) are unit-testable
+ * without mocking awaitJobOrDefer. Any param the wrapper threads into
+ * the FIRST `buildMistralCliInvocation` call MUST also be threaded
+ * through here, or a fresh-workspace / budgeted run can degrade on
+ * the second attempt.
+ */
+export declare function buildMistralRetryPrep(params: Pick<MistralRequestParams, "outputFormat" | "permissionMode" | "effort" | "reasoningEffort" | "allowedTools" | "disallowedTools" | "approvalStrategy" | "trust" | "maxTurns" | "maxPrice"> & {
+    effectivePrompt: string;
+}, recoveryModel: string): {
+    args: string[];
+    env: Record<string, string>;
+    ignoredDisallowedTools: boolean;
+};
 export interface GeminiRequestParams {
     prompt?: string;
     promptParts?: PromptParts;
@@ -235,6 +319,8 @@ export interface GeminiRequestParams {
     policyFiles?: string[];
     adminPolicyFiles?: string[];
     attachments?: string[];
+    /** Phase 4 slice γ: emit `--skip-trust` for fresh-workspace headless runs. */
+    skipTrust?: boolean;
 }
 export interface HandlerDeps {
     sessionManager: ISessionManager;
@@ -273,6 +359,8 @@ export interface GrokRequestParams {
     optimizeResponse?: boolean;
     idleTimeoutMs?: number;
     forceRefresh?: boolean;
+    /** Phase 4 slice δ: cap agent-loop iterations via `--max-turns N`. */
+    maxTurns?: number;
 }
 export declare function handleGrokRequest(deps: HandlerDeps, params: GrokRequestParams): Promise<ExtendedToolResponse>;
 export declare function handleGrokRequestAsync(deps: AsyncHandlerDeps, params: Omit<GrokRequestParams, "optimizeResponse">): Promise<ExtendedToolResponse>;
@@ -297,6 +385,12 @@ export interface MistralRequestParams {
     optimizeResponse?: boolean;
     idleTimeoutMs?: number;
     forceRefresh?: boolean;
+    /** Phase 4 slice γ: emit `--trust` for fresh-workspace headless runs. */
+    trust?: boolean;
+    /** Phase 4 slice δ: Vibe `--max-turns N` cap on agent-loop iterations. */
+    maxTurns?: number;
+    /** Phase 4 slice δ: Vibe `--max-price DOLLARS` cumulative-cost cap. */
+    maxPrice?: number;
 }
 export declare function handleMistralRequest(deps: HandlerDeps, params: MistralRequestParams): Promise<ExtendedToolResponse>;
 export declare function handleMistralRequestAsync(deps: AsyncHandlerDeps, params: Omit<MistralRequestParams, "optimizeResponse">): Promise<ExtendedToolResponse>;
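
Reviewer sketch for the retry invariant stated in the `buildMistralRetryPrep` doc comment: if the first attempt and the retry share one builder, the `trust`, `maxTurns`, and `maxPrice` flags cannot be dropped on the second attempt. `SketchParams`, `buildInvocation`, and `buildRetry` are hypothetical. The argv order and the `-p <prompt>` shape are assumptions, and the model is returned as a field rather than a flag because the diff does not show how the model reaches Vibe.

```typescript
interface SketchParams {
  prompt: string;
  trust?: boolean;
  maxTurns?: number;
  maxPrice?: number;
}

interface SketchInvocation {
  model: string;
  args: string[];
}

// Single source of truth for flag emission, used by both attempts.
function buildInvocation(params: SketchParams, model: string): SketchInvocation {
  const args: string[] = ["-p", params.prompt];
  if (params.trust) args.push("--trust");
  if (params.maxTurns !== undefined) args.push("--max-turns", String(params.maxTurns));
  if (params.maxPrice !== undefined) args.push("--max-price", String(params.maxPrice));
  return { model, args };
}

// The retry swaps only the model; every other parameter flows through unchanged.
function buildRetry(params: SketchParams, recoveryModel: string): SketchInvocation {
  return buildInvocation(params, recoveryModel);
}

const sketchParams: SketchParams = { prompt: "summarise the repo", trust: true, maxTurns: 5, maxPrice: 0.5 };
const firstAttempt = buildInvocation(sketchParams, "model-a");
const retryAttempt = buildRetry(sketchParams, "model-b");
console.log(firstAttempt.args, retryAttempt.args);
```

A test in the spirit of REGRESSIONS B would compare the two argv arrays for every combination of the three optional fields and fail if any flag disappears on retry.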

package/dist/index.js CHANGED

@@ -10,6 +10,8 @@ import { executeCli, killAllProcessGroups } from "./executor.js";
 import { parseStreamJson } from "./stream-json-parser.js";
 import { parseCodexJsonStream } from "./codex-json-parser.js";
 import { parseGeminiJson } from "./gemini-json-parser.js";
+import { parseVibeMetaJson } from "./mistral-meta-json-parser.js";
+import { homedir } from "os";
 import { createSessionManager } from "./session-manager.js";
 import { ResourceProvider } from "./resources.js";
 import { PerformanceMetrics } from "./metrics.js";
@@ -227,6 +229,23 @@ function getApprovalManager(runtimeLogger = logger) {
     return approvalManager;
 }
 const MCP_SERVER_ENUM = z.enum(CLAUDE_MCP_SERVER_NAMES);
+/**
+ * Phase 4 slice δ — shared Zod fragments for `maxTurns` / `maxPrice`.
+ *
+ * Both flags reach the upstream CLIs as decimal-formatted argv strings via
+ * `String(N)`. `z.number().int().positive()` alone lets values past
+ * `Number.MAX_SAFE_INTEGER` through, after which `String(1e21)` emits
+ * scientific notation that Grok and Vibe both reject. The bounds below
+ * (safe-integer cap + 10000 ceiling for turns; finite + 10000 USD ceiling
+ * for price) guarantee a lossless decimal stringification AND a sane
+ * upper bound — no plausible single agent loop exceeds 10k turns or 10k USD.
+ */
+export const MAX_TURNS_SCHEMA = z.number().int().positive().safe().max(10_000);
+// `.min(1e-6)` keeps the value in JS's decimal-stringify range:
+// String(1e-6) === "0.000001" but String(1e-7) === "1e-7", which both
+// upstream CLIs would reject. 1µUSD per request is fine-grained enough
+// for any plausible budget-cap use.
+export const MAX_PRICE_SCHEMA = z.number().positive().finite().min(1e-6).max(10_000);
 // U22: Session-provider enum extended to five providers. The storage layer's
 // CLI_TYPES already includes "mistral"; the MCP-tool layer mirrors that here so
 // session_create / session_list / session_clear_all accept the fifth provider.
@@ -477,7 +496,14 @@ function createErrorResponse(cli, code, stderr, correlationId, error) {
         },
     };
 }
-function extractUsageAndCost(cli, output, outputFormat) {
+export function extractUsageAndCost(cli, output, outputFormat,
+/**
+ * Optional context for off-stdout telemetry sources. Today only Mistral
+ * uses this — its meta.json lives on disk keyed by sessionId. Threading
+ * this in keeps the closure built by `buildAsyncFlightRecorderHandoff`
+ * primitives-only (no `params`/`prep` retention on AsyncJobRecord).
+ */
+ctx) {
     if (cli === "claude" && outputFormat === "stream-json") {
         const parsed = parseStreamJson(output);
         if (!parsed.usage) {
@@ -515,9 +541,14 @@ function extractUsageAndCost(cli, output, outputFormat) {
             cacheReadTokens: parsed.usage.cache_read_tokens,
         };
     }
-    // Mistral/Vibe: does not surface usage in its stdout/stream-json output. A
-    // future unit can read it from `~/.vibe/logs/session/<id>/metadata.json`
-    // once we resolve the session id post-run.
+    // Mistral/Vibe: usage/cost live on disk in `~/.vibe/logs/session/<id>/meta.json`
+    // (Phase 4 slice β). Best-effort: if we don't know the sessionId (fresh
+    // session whose Vibe-assigned UUID we never observed) or the file is
+    // missing/malformed, the parser returns `{}` and the FR row simply lacks
+    // usage data — matching pre-slice behaviour. No stdout fallback exists.
+    if (cli === "mistral") {
+        return parseVibeMetaJson(ctx?.home ?? homedir(), ctx?.sessionId);
+    }
     return {};
 }
 /**
@@ -530,9 +561,13 @@ function extractUsageAndCost(cli, output, outputFormat) {
 function buildAsyncFlightRecorderHandoff(cliName, prep, sessionId, outputFormat) {
     // Extract primitives BEFORE building the closure — capturing `prep` or
     // `params` directly would pin large attachments / promptParts on the
-    // AsyncJobRecord for JOB_TTL_MS.
+    // AsyncJobRecord for JOB_TTL_MS. Phase 4 slice β: `sid` and `home` are
+    // primitives too, threaded through so the Mistral branch of
+    // extractUsageAndCost can read `~/.vibe/logs/session/<id>/meta.json`.
     const cli = cliName;
     const fmt = outputFormat;
+    const sid = sessionId;
+    const home = homedir();
     return {
         flightRecorderEntry: {
             model: prep.resolvedModel || "default",
@@ -541,7 +576,7 @@ function buildAsyncFlightRecorderHandoff(cliName, prep, sessionId, outputFormat)
             stablePrefixHash: prep.stablePrefixHash ?? undefined,
             stablePrefixTokens: prep.stablePrefixTokens ?? undefined,
         },
-        extractUsage: (stdout) => extractUsageAndCost(cli, stdout, fmt),
+        extractUsage: (stdout) => extractUsageAndCost(cli, stdout, fmt, { sessionId: sid, home }),
     };
 }
 function safeFlightStart(entry, runtime = resolveGatewayServerRuntime()) {
@@ -1081,11 +1116,12 @@ export function prepareCodexRequest(params, runtime = resolveGatewayServerRuntim
         args.push("--json");
     }
     args.push("--skip-git-repo-check");
-    // U26: High-impact feature flags. Some of these (`--output-schema`,
-    // `--search`, `-C`, `--add-dir`) are rejected by `codex exec resume`, so we
-    // only emit them on a NEW session. Images / ephemeral / profile /
-    // ignore-rules / ignore-user-config are allowed on resume per the audited
-    // CLI help; we emit them in both branches.
+    // U26: High-impact feature flags. `--search` is rejected by
+    // `codex exec resume` (resume inherits the original session's web-search
+    // state), so we only emit it on a NEW session. `--output-schema`,
+    // `-c key=value`, profile, ephemeral, images, and the ignore-* flags are
+    // all accepted on resume per `codex exec resume --help` (codex-cli 0.133.0)
+    // and are emitted in both branches.
     let highImpactCleanup;
     if (sessionPlan.mode === "new") {
         const high = prepareCodexHighImpactFlags({
@@ -1105,12 +1141,10 @@ export function prepareCodexRequest(params, runtime = resolveGatewayServerRuntim
         highImpactCleanup = high.cleanup;
     }
     else {
-        // On resume, emit only the resume-safe subset (profile, ephemeral,
-        // images, ignoreUserConfig, ignoreRules). outputSchema, search, and
-        // configOverrides are dropped silently to mirror existing behavior for
-        // sandbox/ask-for-approval on resume.
         const high = prepareCodexHighImpactFlags({
+            outputSchema: params.outputSchema,
             profile: params.profile,
+            configOverrides: params.configOverrides,
             ephemeral: params.ephemeral,
             images: params.images,
             ignoreUserConfig: params.ignoreUserConfig,
@@ -1240,6 +1274,10 @@ export function prepareGeminiRequest(params, runtime = resolveGatewayServerRunti
     if (params.outputFormat === "json") {
         args.push("-o", "json");
     }
+    // Phase 4 slice γ: opt-in trust-prompt bypass for fresh workspaces.
+    if (params.skipTrust) {
+        args.push("--skip-trust");
+    }
     return {
         corrId,
         effectivePrompt,
@@ -1252,7 +1290,7 @@ export function prepareGeminiRequest(params, runtime = resolveGatewayServerRunti
         stablePrefixTokens,
     };
 }
-function prepareGrokRequest(params, runtime = resolveGatewayServerRuntime()) {
+export function prepareGrokRequest(params, runtime = resolveGatewayServerRuntime()) {
     const corrId = params.correlationId || randomUUID();
     const cliInfo = getCliInfo();
     const resolvedModel = resolveModelAlias("grok", params.model, cliInfo);
@@ -1328,6 +1366,9 @@ function prepareGrokRequest(params, runtime = resolveGatewayServerRuntime()) {
     if (params.disallowedTools && params.disallowedTools.length > 0) {
         args.push("--disallowed-tools", params.disallowedTools.join(","));
     }
+    if (params.maxTurns !== undefined) {
+        args.push("--max-turns", String(params.maxTurns));
+    }
     return {
         corrId,
         effectivePrompt,
@@ -1411,6 +1452,9 @@ export function prepareMistralRequest(params, runtime = resolveGatewayServerRunt
         reasoningEffort: params.reasoningEffort,
         allowedTools: params.allowedTools,
         disallowedTools: params.disallowedTools,
+        trust: params.trust,
+        maxTurns: params.maxTurns,
+        maxPrice: params.maxPrice,
     });
     if (prep.ignoredDisallowedTools) {
         runtime.logger.info(`[${corrId}] Mistral does not support disallowedTools; ignoring (caller passed ${params.disallowedTools?.length ?? 0} entries)`);
@@ -1441,6 +1485,32 @@ function selectMistralRecoveryModel(failedModel) {
     ].filter((model) => Boolean(model && model !== failedModel));
     return candidates.find(model => model !== "local");
 }
+/**
+ * Phase 4 slice δ post-review: pure helper extracted from
+ * `handleMistralRequest` so the retry-path arg-preservation invariants
+ * (trust + maxTurns + maxPrice from slices γ/δ) are unit-testable
+ * without mocking awaitJobOrDefer. Any param the wrapper threads into
+ * the FIRST `buildMistralCliInvocation` call MUST also be threaded
+ * through here, or a fresh-workspace / budgeted run can degrade on
+ * the second attempt.
+ */
+export function buildMistralRetryPrep(params, recoveryModel) {
+    return buildMistralCliInvocation({
+        prompt: params.effectivePrompt,
+        resolvedModel: recoveryModel,
+        outputFormat: params.outputFormat,
+        permissionMode: params.approvalStrategy === "mcp_managed"
+            ? "auto-approve"
+            : (params.permissionMode ?? "auto-approve"),
+        effort: params.effort,
+        reasoningEffort: params.reasoningEffort,
+        allowedTools: params.allowedTools,
+        disallowedTools: params.disallowedTools,
+        trust: params.trust,
+        maxTurns: params.maxTurns,
+        maxPrice: params.maxPrice,
+    });
+}
 function buildCliResponse(cli, stdout, optimizeResponse, corrId, sessionId, prep, durationMs, resumable, outputFormat, warnings) {
     let finalStdout = stdout;
     // Skip response optimization for JSON output to prevent corrupting structured data
@@ -1466,7 +1536,10 @@ function buildCliResponse(cli, stdout, optimizeResponse, corrId, sessionId, prep
             correlationId: corrId,
             sessionId: sessionId || null,
             durationMs,
-            ...extractUsageAndCost(cli, stdout, outputFormat),
+            // Phase 4 slice β: thread sessionId + home so the Mistral branch of
+            // extractUsageAndCost can read `~/.vibe/logs/session/<dir>/meta.json`.
+            // Other CLIs ignore the ctx (their usage source is stdout).
+            ...extractUsageAndCost(cli, stdout, outputFormat, { sessionId, home: homedir() }),
             exitCode: 0,
             retryCount: 0,
         },
@@ -1564,6 +1637,7 @@ export async function handleGeminiRequest(deps, params) {
         policyFiles: params.policyFiles,
         adminPolicyFiles: params.adminPolicyFiles,
         attachments: params.attachments,
+        skipTrust: params.skipTrust,
     }, runtime);
     if (!("args" in prep))
         return prep;
@@ -1692,6 +1766,7 @@ export async function handleGeminiRequestAsync(deps, params) {
         policyFiles: params.policyFiles,
         adminPolicyFiles: params.adminPolicyFiles,
         attachments: params.attachments,
+        skipTrust: params.skipTrust,
     }, runtime);
     if (!("args" in prep))
         return prep;
@@ -1774,6 +1849,7 @@ export async function handleGrokRequest(deps, params) {
         correlationId: params.correlationId,
         optimizePrompt: params.optimizePrompt,
         operation: "grok_request",
+        maxTurns: params.maxTurns,
     }, runtime);
     if (!("args" in prep))
         return prep;
@@ -1894,6 +1970,7 @@ export async function handleGrokRequestAsync(deps, params) {
         correlationId: params.correlationId,
         optimizePrompt: params.optimizePrompt,
         operation: "grok_request_async",
+        maxTurns: params.maxTurns,
     }, runtime);
     if (!("args" in prep))
         return prep;
@@ -1975,6 +2052,9 @@ export async function handleMistralRequest(deps, params) {
         correlationId: params.correlationId,
         optimizePrompt: params.optimizePrompt,
         operation: "mistral_request",
+        trust: params.trust,
+        maxTurns: params.maxTurns,
+        maxPrice: params.maxPrice,
     }, runtime);
     if (!("args" in prep))
         return prep;
@@ -2007,18 +2087,7 @@ export async function handleMistralRequest(deps, params) {
             const recoveryModel = selectMistralRecoveryModel(prep.resolvedModel);
             if (recoveryModel) {
                 deps.logger.info(`[${corrId}] mistral_request detected stale Vibe model selection; retrying once with ${recoveryModel}`);
-                const retryPrep = buildMistralCliInvocation({
-                    prompt: prep.effectivePrompt,
-                    resolvedModel: recoveryModel,
-                    outputFormat: params.outputFormat,
-                    permissionMode: params.approvalStrategy === "mcp_managed"
-                        ? "auto-approve"
-                        : (params.permissionMode ?? "auto-approve"),
-                    effort: params.effort,
-                    reasoningEffort: params.reasoningEffort,
-                    allowedTools: params.allowedTools,
-                    disallowedTools: params.disallowedTools,
-                });
+                const retryPrep = buildMistralRetryPrep({ ...params, effectivePrompt: prep.effectivePrompt }, recoveryModel);
                 const retryArgs = [...retryPrep.args, ...sessionResult.resumeArgs];
                 // Reuse the FR handoff built above — the retry preserves corrId,
                 // so the manager's logComplete still updates the original row.
@@ -2118,6 +2187,9 @@ export async function handleMistralRequestAsync(deps, params) {
         correlationId: params.correlationId,
         optimizePrompt: params.optimizePrompt,
         operation: "mistral_request_async",
+        trust: params.trust,
+        maxTurns: params.maxTurns,
+        maxPrice: params.maxPrice,
     }, runtime);
     if (!("args" in prep))
         return prep;
@@ -3006,7 +3078,11 @@ export function createGatewayServer(deps = {}) {
         policyFiles: GEMINI_HIGH_IMPACT_PARAMS_SCHEMA.shape.policyFiles.describe("Policy file paths (--policy <path>, one per file). Paths must exist."),
         adminPolicyFiles: GEMINI_HIGH_IMPACT_PARAMS_SCHEMA.shape.adminPolicyFiles.describe("Admin policy file paths (--admin-policy <path>, one per file). Paths must exist."),
         attachments: GEMINI_HIGH_IMPACT_PARAMS_SCHEMA.shape.attachments.describe("Absolute file paths prepended as @<path> tokens to the prompt"),
-    }, async ({ prompt, promptParts, model, sessionId, resumeLatest, createNewSession, approvalMode, approvalStrategy, approvalPolicy, mcpServers, allowedTools, includeDirs, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, outputFormat, sandbox, policyFiles, adminPolicyFiles, attachments, }) => {
+        skipTrust: z
+            .boolean()
+            .default(false)
+            .describe("Emit `--skip-trust` so Gemini trusts the workspace for this session and skips the interactive trust prompt (Phase 4 slice γ). Required for headless runs in fresh workspaces."),
+    }, async ({ prompt, promptParts, model, sessionId, resumeLatest, createNewSession, approvalMode, approvalStrategy, approvalPolicy, mcpServers, allowedTools, includeDirs, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, outputFormat, sandbox, policyFiles, adminPolicyFiles, attachments, skipTrust, }) => {
         return handleGeminiRequest({ sessionManager, logger, runtime }, {
             prompt,
             promptParts,
@@ -3030,6 +3106,7 @@ export function createGatewayServer(deps = {}) {
             policyFiles,
             adminPolicyFiles,
             attachments,
+            skipTrust,
         });
     });
     //──────────────────────────────────────────────────────────────────────────────
@@ -3104,7 +3181,8 @@ export function createGatewayServer(deps = {}) {
             .boolean()
             .default(false)
             .describe("Bypass dedup and force a fresh CLI run even if a recent identical request exists"),
-    }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, alwaysApprove, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, }) => {
+        maxTurns: MAX_TURNS_SCHEMA.optional().describe("Grok `--max-turns N`: cap on agent-loop iterations for cost / latency control (Phase 4 slice δ). Bounded to safe integers ≤ 10000."),
+    }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, alwaysApprove, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, maxTurns, }) => {
         return handleGrokRequest({ sessionManager, logger, runtime }, {
             prompt,
             promptParts,
@@ -3127,6 +3205,7 @@ export function createGatewayServer(deps = {}) {
             optimizeResponse,
             idleTimeoutMs,
             forceRefresh,
+            maxTurns,
         });
     });
     //──────────────────────────────────────────────────────────────────────────────
@@ -3200,7 +3279,13 @@ export function createGatewayServer(deps = {}) {
             .boolean()
             .default(false)
             .describe("Bypass dedup and force a fresh CLI run even if a recent identical request exists"),
-    }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, }) => {
+        trust: z
+            .boolean()
+            .default(false)
+            .describe("Emit `--trust` so Vibe trusts the cwd for this invocation only (not persisted to trusted_folders.toml) and skips the interactive trust prompt (Phase 4 slice γ)."),
+        maxTurns: MAX_TURNS_SCHEMA.optional().describe("Vibe `--max-turns N`: cap the agent-loop iteration count (programmatic mode only, Phase 4 slice δ). Bounded to safe integers ≤ 10000."),
+        maxPrice: MAX_PRICE_SCHEMA.optional().describe("Vibe `--max-price DOLLARS`: interrupt the session when cumulative cost crosses this cap (programmatic mode only, Phase 4 slice δ). Bounded to finite values ≤ 10000 USD."),
+    }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, trust, maxTurns, maxPrice, }) => {
         return handleMistralRequest({ sessionManager, logger, runtime }, {
             prompt,
             promptParts,
@@ -3222,6 +3307,9 @@ export function createGatewayServer(deps = {}) {
             optimizeResponse,
             idleTimeoutMs,
             forceRefresh,
+            trust,
+            maxTurns,
+            maxPrice,
         });
     });
     //──────────────────────────────────────────────────────────────────────────────
@@ -3612,7 +3700,11 @@ export function createGatewayServer(deps = {}) {
             policyFiles: GEMINI_HIGH_IMPACT_PARAMS_SCHEMA.shape.policyFiles.describe("Policy file paths (--policy <path>, one per file). Paths must exist."),
             adminPolicyFiles: GEMINI_HIGH_IMPACT_PARAMS_SCHEMA.shape.adminPolicyFiles.describe("Admin policy file paths (--admin-policy <path>, one per file). Paths must exist."),
             attachments: GEMINI_HIGH_IMPACT_PARAMS_SCHEMA.shape.attachments.describe("Absolute file paths prepended as @<path> tokens to the prompt"),
-        }, async ({ prompt, promptParts, model, sessionId, resumeLatest, createNewSession, approvalMode, approvalStrategy, approvalPolicy, mcpServers, allowedTools, includeDirs, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, outputFormat, sandbox, policyFiles, adminPolicyFiles, attachments, }) => {
+            skipTrust: z
+                .boolean()
+                .default(false)
+                .describe("Emit `--skip-trust` so Gemini trusts the workspace for this session and skips the interactive trust prompt (Phase 4 slice γ). Required for headless runs in fresh workspaces."),
+        }, async ({ prompt, promptParts, model, sessionId, resumeLatest, createNewSession, approvalMode, approvalStrategy, approvalPolicy, mcpServers, allowedTools, includeDirs, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, outputFormat, sandbox, policyFiles, adminPolicyFiles, attachments, skipTrust, }) => {
             return handleGeminiRequestAsync({ sessionManager, asyncJobManager, logger, runtime }, {
                 prompt,
                 promptParts,
@@ -3635,6 +3727,7 @@ export function createGatewayServer(deps = {}) {
                 policyFiles,
                 adminPolicyFiles,
                 attachments,
+                skipTrust,
             });
         });
         server.tool("grok_request_async", {
@@ -3705,7 +3798,8 @@ export function createGatewayServer(deps = {}) {
                 .boolean()
                 .default(false)
                 .describe("Bypass dedup and force a fresh CLI run even if a recent identical request exists"),
-        }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, alwaysApprove, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, }) => {
+            maxTurns: MAX_TURNS_SCHEMA.optional().describe("Grok `--max-turns N`: cap on agent-loop iterations for cost / latency control (Phase 4 slice δ). Bounded to safe integers ≤ 10000."),
+        }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, alwaysApprove, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, maxTurns, }) => {
             return handleGrokRequestAsync({ sessionManager, asyncJobManager, logger, runtime }, {
                 prompt,
                 promptParts,
@@ -3727,6 +3821,7 @@ export function createGatewayServer(deps = {}) {
                 optimizePrompt,
                 idleTimeoutMs,
                 forceRefresh,
+                maxTurns,
             });
         });
         server.tool("mistral_request_async", {
@@ -3796,7 +3891,13 @@ export function createGatewayServer(deps = {}) {
                 .boolean()
                 .default(false)
                 .describe("Bypass dedup and force a fresh CLI run even if a recent identical request exists"),
-        }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, }) => {
+            trust: z
+                .boolean()
+                .default(false)
+                .describe("Emit `--trust` so Vibe trusts the cwd for this invocation only (not persisted to trusted_folders.toml) and skips the interactive trust prompt (Phase 4 slice γ)."),
+            maxTurns: MAX_TURNS_SCHEMA.optional().describe("Vibe `--max-turns N`: cap the agent-loop iteration count (programmatic mode only, Phase 4 slice δ). Bounded to safe integers ≤ 10000."),
+            maxPrice: MAX_PRICE_SCHEMA.optional().describe("Vibe `--max-price DOLLARS`: interrupt the session when cumulative cost crosses this cap (programmatic mode only, Phase 4 slice δ). Bounded to finite values ≤ 10000 USD."),
+        }, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, trust, maxTurns, maxPrice, }) => {
             return handleMistralRequestAsync({ sessionManager, asyncJobManager, logger, runtime }, {
                 prompt,
                 promptParts,
@@ -3817,6 +3918,9 @@ export function createGatewayServer(deps = {}) {
                 optimizePrompt,
                 idleTimeoutMs,
                 forceRefresh,
+                trust,
+                maxTurns,
+                maxPrice,
             });
         });
         server.tool("llm_job_status", {
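
The tool descriptions above state the bounds for `maxTurns` ("safe integers ≤ 10000") and `maxPrice` ("finite values ≤ 10000 USD"), but the definitions of `MAX_TURNS_SCHEMA` and `MAX_PRICE_SCHEMA` are not part of this diff. A minimal sketch of equivalent checks in plain JavaScript follows. The function names are hypothetical. The lower bound of 1 for turns is an assumption taken from the contract pattern `/^[1-9][0-9]*$/`, and the `1e-6` lower bound for price comes from the comment on the `--max-price` contract entry.

```javascript
// Hypothetical plain-JS stand-ins for MAX_TURNS_SCHEMA and MAX_PRICE_SCHEMA.
// The real schemas are zod objects whose definitions are not shown in this diff.
const MAX_CAP = 10000;

function isValidMaxTurns(value) {
    // Safe integer in [1, 10000]. The lower bound of 1 is an assumption
    // inferred from the contract pattern /^[1-9][0-9]*$/.
    return Number.isSafeInteger(value) && value >= 1 && value <= MAX_CAP;
}

function isValidMaxPrice(value) {
    // Finite number in [1e-6, 10000] USD. The 1e-6 floor keeps String(value)
    // in decimal form, as noted in the --max-price contract comment.
    return typeof value === "number"
        && Number.isFinite(value)
        && value >= 1e-6
        && value <= MAX_CAP;
}
```

Rejecting out-of-range values before argv construction matters because the request helper forwards whatever it receives via `String(...)` without further checks.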

package/dist/mistral-meta-json-parser.d.ts ADDED

@@ -0,0 +1,6 @@
+export interface VibeMetaJsonUsage {
+    inputTokens?: number;
+    outputTokens?: number;
+    costUsd?: number;
+}
+export declare function parseVibeMetaJson(home: string, sessionId: string | undefined): VibeMetaJsonUsage;

package/dist/mistral-meta-json-parser.js ADDED

@@ -0,0 +1,175 @@
+/**
+ * Phase 4 slice β — Mistral Vibe `meta.json` parser.
+ *
+ * Vibe writes per-session telemetry to
+ *
+ *   ~/.vibe/logs/session/session_<YYYYMMDD>_<HHMMSS>_<first8hex>/meta.json
+ *
+ * where `<first8hex>` is the first 8 lowercase hex characters of the full
+ * session UUID. Inside the file:
+ *
+ *   {
+ *     "session_id": "<full-uuid>",
+ *     "stats": {
+ *       "session_prompt_tokens":      <number>  → inputTokens
+ *       "session_completion_tokens":  <number>  → outputTokens
+ *       "session_cost":               <number>  → costUsd
+ *     }
+ *   }
+ *
+ * The gateway's mistral session-id surface accepts the full UUID (so does
+ * `vibe --resume <uuid>`). To find the right directory we glob for
+ * `session_*_<first8>` and disambiguate by reading each candidate's
+ * `session_id` field. If callers happen to pass the directory basename
+ * itself we still honour that — useful for tests and for forward-compat if
+ * Vibe ever changes its dir naming scheme.
+ *
+ * Cache-token surfaces are not exposed by Vibe today, so `cacheReadTokens`
+ * and `cacheCreationTokens` are intentionally absent.
+ *
+ * Best-effort by design: any failure (missing file, bad JSON, missing
+ * fields, gateway-generated `gw-*` sessionId, unresolvable UUID, path
+ * outside the session log root) returns `{}` so the flight-recorder row
+ * simply lacks usage data.
+ */
+import { existsSync, readdirSync, readFileSync, realpathSync, statSync } from "fs";
+import { join, resolve, sep } from "path";
+import { GATEWAY_SESSION_PREFIX } from "./request-helpers.js";
+function asPositiveNumber(value) {
+    if (typeof value !== "number" || !Number.isFinite(value) || value < 0) {
+        return undefined;
+    }
+    return value;
+}
+/**
+ * Read a file only if its realpath lives under `realBase`. Returns undefined
+ * on any error, missing file, or out-of-tree symlink target. This is the one
+ * place that calls `readFileSync` for meta.json content — the rest of the
+ * module routes through it so the security boundary is uniform.
+ */
+function readInBase(realBase, candidate) {
+    if (!existsSync(candidate))
+        return undefined;
+    let realCandidate;
+    try {
+        realCandidate = realpathSync(candidate);
+    }
+    catch {
+        return undefined;
+    }
+    const realBaseWithSep = realBase.endsWith(sep) ? realBase : realBase + sep;
+    if (!realCandidate.startsWith(realBaseWithSep))
+        return undefined;
+    try {
+        return readFileSync(realCandidate, "utf-8");
+    }
+    catch {
+        return undefined;
+    }
+}
+// UUID v4-ish (Vibe's own session UUIDs are not strictly v4, so we
+// validate against the broader 8-4-4-4-12 lowercase-hex shape) OR
+// Vibe's session_<digits>_<digits>_<first8> directory basename.
+const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
+const DIRNAME_RE = /^session_\d{8}_\d{6}_[0-9a-f]{8}$/;
+/**
+ * Resolve the session-log directory basename for a given gateway sessionId.
+ * Returns undefined when no candidate can be found or the input is
+ * unsuitable. Pure with respect to side-effects on the caller — only reads
+ * the filesystem.
+ *
+ * Security invariants enforced here:
+ *   - Inputs are charset-gated (UUID or DIRNAME) before any filesystem read.
+ *   - For UUID input, the chosen candidate's meta.json MUST advertise the
+ *     same `session_id` — single-candidate is NOT trusted, because two
+ *     UUIDs sharing the first 8 hex chars would otherwise cross-attribute
+ *     usage (and leak telemetry to the caller of the other session).
+ */
+function resolveVibeSessionDirname(baseDir, realBase, sessionId) {
+    // 1. Caller already supplied the directory name verbatim.
+    if (DIRNAME_RE.test(sessionId) && existsSync(join(baseDir, sessionId, "meta.json"))) {
+        return sessionId;
+    }
+    // 2. Treat the input as a full session UUID.
+    if (!UUID_RE.test(sessionId))
+        return undefined;
+    const short = sessionId.slice(0, 8).toLowerCase();
+    let entries;
+    try {
+        entries = readdirSync(baseDir);
+    }
+    catch {
+        return undefined;
+    }
+    // Filter to candidates matching `session_*_<short>`. Sort newest-first
+    // by mtime; we still require an exact session_id match below.
+    const candidates = entries
+        .filter(name => DIRNAME_RE.test(name) && name.endsWith(`_${short}`))
+        .map(name => {
+        let mtimeMs = 0;
+        try {
+            mtimeMs = statSync(join(baseDir, name)).mtimeMs;
+        }
+        catch {
+            /* ignore */
+        }
+        return { name, mtimeMs };
+    })
+        .sort((a, b) => b.mtimeMs - a.mtimeMs);
+    for (const { name } of candidates) {
+        const text = readInBase(realBase, join(baseDir, name, "meta.json"));
+        if (text === undefined)
+            continue;
+        try {
+            const parsed = JSON.parse(text);
+            if (typeof parsed.session_id === "string" && parsed.session_id === sessionId) {
+                return name;
+            }
+        }
+        catch {
+            /* ignore and continue */
+        }
+    }
+    return undefined;
+}
+export function parseVibeMetaJson(home, sessionId) {
+    if (!sessionId)
+        return {};
+    if (sessionId.startsWith(GATEWAY_SESSION_PREFIX)) {
+        // gw-* IDs are gateway internal — Vibe never wrote a meta.json under that name.
+        return {};
+    }
+    const baseDir = resolve(join(home, ".vibe", "logs", "session"));
+    let realBase;
+    try {
+        realBase = realpathSync(baseDir);
+    }
+    catch {
+        return {};
+    }
+    const dirname = resolveVibeSessionDirname(baseDir, realBase, sessionId);
+    if (!dirname)
+        return {};
+    // `readInBase` is the security boundary: it realpath-resolves the file
+    // and rejects anything whose target lives outside `realBase`. Re-routing
+    // the final read through it (instead of a bespoke readFileSync) keeps
+    // the in-tree-only invariant in one place.
+    const text = readInBase(realBase, join(baseDir, dirname, "meta.json"));
+    if (text === undefined)
+        return {};
+    let raw;
+    try {
+        raw = JSON.parse(text);
+    }
+    catch {
+        return {};
+    }
+    const stats = raw?.stats;
+    if (!stats || typeof stats !== "object")
+        return {};
+    return {
+        inputTokens: asPositiveNumber(stats.session_prompt_tokens),
+        outputTokens: asPositiveNumber(stats.session_completion_tokens),
+        costUsd: asPositiveNumber(stats.session_cost),
+    };
+}
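
Two checks in the parser above are easy to misread: the trailing-separator containment test in `readInBase`, and the `session_*_<first8>` candidate filter in `resolveVibeSessionDirname`. The sketch below isolates both as pure functions so they can be exercised without a filesystem. The names `isInsideBase` and `candidateDirnames` are hypothetical, and the separator is passed as a parameter here, whereas the module imports `sep` from `path`.

```javascript
// Containment test on two already-resolved paths. Appending the separator
// to the base prevents a sibling such as ".../session-evil" from passing
// a plain prefix comparison against ".../session".
function isInsideBase(realBase, realCandidate, sep = "/") {
    const baseWithSep = realBase.endsWith(sep) ? realBase : realBase + sep;
    return realCandidate.startsWith(baseWithSep);
}

// Same directory-name shape the module validates against.
const DIRNAME_RE = /^session_\d{8}_\d{6}_[0-9a-f]{8}$/;

// Directory names that could belong to the given session UUID, based on
// its first 8 hex characters. This is only a pre-filter: the module still
// requires each candidate's meta.json to carry the exact session_id.
function candidateDirnames(entries, sessionId) {
    const short = sessionId.slice(0, 8).toLowerCase();
    return entries.filter(name => DIRNAME_RE.test(name) && name.endsWith(`_${short}`));
}
```

Because two UUIDs can share their first 8 hex characters, the filter may return more than one directory; this is why the module reads each candidate and compares `session_id` rather than trusting a single match.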

package/dist/request-helpers.d.ts CHANGED

@@ -107,6 +107,24 @@ export interface PrepareMistralRequestInput {
      * emit a `logger.warn` when this is non-empty.
      */
     disallowedTools?: string[];
+    /**
+     * Phase 4 slice γ: emit `--trust` so non-interactive runs in fresh
+     * workspaces skip Vibe's interactive trust prompt for this invocation
+     * only (not persisted to `trusted_folders.toml`). Default undefined →
+     * Vibe's prompt behaviour is preserved for existing callers.
+     */
+    trust?: boolean;
+    /**
+     * Phase 4 slice δ: emit `--max-turns N` to cap the agent-loop iteration
+     * count (only applies in programmatic mode with `-p`).
+     */
+    maxTurns?: number;
+    /**
+     * Phase 4 slice δ: emit `--max-price DOLLARS` so the session is
+     * interrupted when cumulative cost crosses the cap (programmatic mode
+     * only).
+     */
+    maxPrice?: number;
 }
 export interface PrepareMistralRequestResult {
     args: string[];
@@ -204,9 +222,11 @@ export declare function resolveCodexSandboxFlags(input: CodexSandboxFlagsInput):
  * Flags that `codex exec resume` rejects (the original session's policy is
  * inherited). Callers must drop these when building resume argv.
  *
- * U26 expands this list with `--add-dir`, `-C`, `--output-schema`, and
- * `--search`, all of which `codex exec resume --help` rejects at the audit
- * date.
+ * Verified against `codex exec resume --help` (codex-cli 0.133.0):
+ * `--full-auto`, `--sandbox`, `--ask-for-approval`, `--add-dir`, `-C`, and
+ * `--search` are rejected. `--output-schema` and `-c key=value` ARE accepted
+ * on resume and therefore are NOT in this filter (Phase 4 slice α restored
+ * the previously-silent drop of those two).
  */
 export declare const CODEX_RESUME_FILTERED_FLAGS: ReadonlySet<string>;
 /**
@@ -398,8 +418,8 @@ export declare const CODEX_HIGH_IMPACT_PARAMS_SCHEMA: z.ZodObject<{
     ignoreRules: z.ZodOptional<z.ZodBoolean>;
 }, "strip", z.ZodTypeAny, {
     search?: boolean | undefined;
-    profile?: string | undefined;
     outputSchema?: string | Record<string, unknown> | undefined;
+    profile?: string | undefined;
     configOverrides?: Record<string, string> | undefined;
     ephemeral?: boolean | undefined;
     images?: string[] | undefined;
@@ -407,8 +427,8 @@ export declare const CODEX_HIGH_IMPACT_PARAMS_SCHEMA: z.ZodObject<{
     ignoreRules?: boolean | undefined;
 }, {
     search?: boolean | undefined;
-    profile?: string | undefined;
     outputSchema?: string | Record<string, unknown> | undefined;
+    profile?: string | undefined;
     configOverrides?: Record<string, string> | undefined;
     ephemeral?: boolean | undefined;
     images?: string[] | undefined;

package/dist/request-helpers.js CHANGED

@@ -176,6 +176,15 @@ export function prepareMistralRequest(input) {
             args.push("--enabled-tools", tool);
         }
     }
+    if (input.trust) {
+        args.push("--trust");
+    }
+    if (input.maxTurns !== undefined) {
+        args.push("--max-turns", String(input.maxTurns));
+    }
+    if (input.maxPrice !== undefined) {
+        args.push("--max-price", String(input.maxPrice));
+    }
     const ignoredDisallowedTools = Boolean(input.disallowedTools && input.disallowedTools.length > 0);
     return { args, env, ignoredDisallowedTools };
 }
@@ -279,9 +288,11 @@ export function resolveCodexSandboxFlags(input) {
  * Flags that `codex exec resume` rejects (the original session's policy is
  * inherited). Callers must drop these when building resume argv.
  *
- * U26 expands this list with `--add-dir`, `-C`, `--output-schema`, and
- * `--search`, all of which `codex exec resume --help` rejects at the audit
- * date.
+ * Verified against `codex exec resume --help` (codex-cli 0.133.0):
+ * `--full-auto`, `--sandbox`, `--ask-for-approval`, `--add-dir`, `-C`, and
+ * `--search` are rejected. `--output-schema` and `-c key=value` ARE accepted
+ * on resume and therefore are NOT in this filter (Phase 4 slice α restored
+ * the previously-silent drop of those two).
  */
 export const CODEX_RESUME_FILTERED_FLAGS = new Set([
     "--full-auto",
@@ -289,7 +300,6 @@ export const CODEX_RESUME_FILTERED_FLAGS = new Set([
     "--ask-for-approval",
     "--add-dir",
     "-C",
-    "--output-schema",
     "--search",
 ]);
 /**
@@ -301,7 +311,6 @@ const CODEX_RESUME_FILTERED_FLAGS_WITH_VALUE = new Set([
     "--ask-for-approval",
     "--add-dir",
     "-C",
-    "--output-schema",
 ]);
 /**
  * Strip resume-incompatible flag/value pairs from a Codex argv segment.
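
The three branches added to `prepareMistralRequest` in this file can be read in isolation as a small argv builder. The sketch below mirrors them under the hypothetical name `budgetFlags`; it is an illustration of the emission rules, not the helper itself, and it omits every other flag the real function handles.

```javascript
// Mirrors the three added branches in prepareMistralRequest.
// `budgetFlags` is a hypothetical name used only for this sketch.
function budgetFlags(input) {
    const args = [];
    // --trust is a bare flag: emitted only when the input is truthy.
    if (input.trust) {
        args.push("--trust");
    }
    // The numeric caps are emitted whenever they are defined, so a value
    // of 0 would still be forwarded; bounds are expected to be enforced
    // by the tool schemas before this point.
    if (input.maxTurns !== undefined) {
        args.push("--max-turns", String(input.maxTurns));
    }
    if (input.maxPrice !== undefined) {
        args.push("--max-price", String(input.maxPrice));
    }
    return args;
}
```

For example, `budgetFlags({ trust: true, maxTurns: 3, maxPrice: 0.01 })` yields `["--trust", "--max-turns", "3", "--max-price", "0.01"]`, which matches the argument order used in the `mistral-max-turns-and-price` conformance fixture.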

package/dist/upstream-contracts.js CHANGED

@@ -133,14 +133,11 @@ export const UPSTREAM_CLI_CONTRACTS = {
             "ignoreRules",
         ],
         resumeOnlyFlags: ["--last"],
-        resumeForbiddenFlags: [
-            "--sandbox",
-            "--ask-for-approval",
-            "--full-auto",
-            "--output-schema",
-            "--search",
-            "-c",
-        ],
+        // Phase 4 slice α (v1.8.0) verified that `codex exec resume` accepts
+        // `--output-schema` and `-c` (codex-cli 0.133.0 `exec resume --help`),
+        // so they're no longer forbidden. `--search` stays forbidden (resume
+        // inherits the original session's web-search state).
+        resumeForbiddenFlags: ["--sandbox", "--ask-for-approval", "--full-auto", "--search"],
         flags: {
             "--last": { arity: "none", description: "Resume latest session" },
             "--model": { arity: "one", description: "Model selector" },
@@ -189,9 +186,24 @@ export const UPSTREAM_CLI_CONTRACTS = {
                 expect: "fail",
             },
             {
+                // Phase 4 slice α: --output-schema IS accepted on resume per
+                // codex-cli 0.133.0; this fixture pins the new behaviour so future
+                // contract changes can't silently regress.
                 id: "codex-resume-output-schema",
-                description: "Resume-incompatible output schema flag is rejected",
+                description: "Phase 4 slice α: --output-schema accepted on resume (codex-cli 0.133.0)",
                 args: ["exec", "resume", "--output-schema", "/tmp/schema.json", "session-id", "hello"],
+                expect: "pass",
+            },
+            {
+                id: "codex-resume-config-override",
+                description: "Phase 4 slice α: -c key=value accepted on resume",
+                args: ["exec", "resume", "-c", "model.foo=bar", "session-id", "hello"],
+                expect: "pass",
+            },
+            {
+                id: "codex-resume-search-still-forbidden",
+                description: "Phase 4 slice α: --search remains forbidden on resume",
+                args: ["exec", "resume", "--search", "session-id", "hello"],
                 expect: "fail",
             },
         ],
@@ -219,6 +231,8 @@ export const UPSTREAM_CLI_CONTRACTS = {
             "policyFiles",
             "adminPolicyFiles",
             "attachments",
+            // Phase 4 slice γ
+            "skipTrust",
         ],
         flags: {
             "-p": { arity: "one", description: "Prompt text" },
@@ -236,6 +250,10 @@ export const UPSTREAM_CLI_CONTRACTS = {
             "--admin-policy": { arity: "one", description: "Admin policy file path" },
             "-o": { arity: "one", values: ["json"], description: "Output format" },
             "--resume": { arity: "one", description: "Resume session" },
+            "--skip-trust": {
+                arity: "none",
+                description: "Trust workspace for this session (Phase 4 slice γ)",
+            },
         },
         env: {},
         conformanceFixtures: [
@@ -251,6 +269,12 @@ export const UPSTREAM_CLI_CONTRACTS = {
                 args: ["-p", "hello", "--not-a-gemini-flag"],
                 expect: "fail",
             },
+            {
+                id: "gemini-skip-trust",
+                description: "Phase 4 slice γ: --skip-trust is accepted",
+                args: ["-p", "hello", "--skip-trust"],
+                expect: "pass",
+            },
         ],
     },
     grok: {
@@ -275,6 +299,8 @@ export const UPSTREAM_CLI_CONTRACTS = {
             "mcpServers",
             "allowedTools",
             "disallowedTools",
+            // Phase 4 slice δ
+            "maxTurns",
         ],
         flags: {
             "-p": { arity: "one", description: "Prompt text" },
@@ -299,6 +325,11 @@ export const UPSTREAM_CLI_CONTRACTS = {
             },
             "--resume": { arity: "one", description: "Resume session" },
             "--continue": { arity: "none", description: "Continue latest session" },
+            "--max-turns": {
+                arity: "one",
+                pattern: /^[1-9][0-9]*$/,
+                description: "Agent-loop iteration cap (Phase 4 slice δ)",
+            },
         },
         env: {},
         conformanceFixtures: [
@@ -314,6 +345,18 @@ export const UPSTREAM_CLI_CONTRACTS = {
                 args: ["-p", "hello", "--not-a-grok-flag"],
                 expect: "fail",
             },
+            {
+                id: "grok-max-turns",
+                description: "Phase 4 slice δ: --max-turns N is accepted",
+                args: ["-p", "hello", "--max-turns", "5"],
+                expect: "pass",
+            },
+            {
+                id: "grok-max-turns-invalid-zero",
+                description: "Phase 4 slice δ: --max-turns 0 is rejected by contract pattern",
+                args: ["-p", "hello", "--max-turns", "0"],
+                expect: "fail",
+            },
         ],
     },
     mistral: {
@@ -337,6 +380,11 @@ export const UPSTREAM_CLI_CONTRACTS = {
             "mcpServers",
             "allowedTools",
             "disallowedTools",
+            // Phase 4 slice γ
+            "trust",
+            // Phase 4 slice δ
+            "maxTurns",
+            "maxPrice",
         ],
         flags: {
             "-p": { arity: "one", description: "Prompt text" },
@@ -355,6 +403,22 @@ export const UPSTREAM_CLI_CONTRACTS = {
             "--enabled-tools": { arity: "one", description: "Enabled tool" },
             "--resume": { arity: "one", description: "Resume session" },
             "--continue": { arity: "none", description: "Continue latest session" },
+            "--trust": {
+                arity: "none",
+                description: "Trust cwd for this invocation only (Phase 4 slice γ)",
+            },
+            "--max-turns": {
+                arity: "one",
+                pattern: /^[1-9][0-9]*$/,
+                description: "Agent-loop iteration cap (Phase 4 slice δ, programmatic mode only)",
+            },
+            "--max-price": {
+                arity: "one",
+                // Decimal-only: matches the MAX_PRICE_SCHEMA min(1e-6) lower bound
+                // that keeps String(N) in decimal form (no scientific notation).
+                pattern: /^(0|[1-9][0-9]*)(\.[0-9]+)?$/,
+                description: "Cumulative cost cap in USD (Phase 4 slice δ, programmatic mode only)",
+            },
         },
         env: {
             VIBE_ACTIVE_MODEL: {
@@ -378,6 +442,27 @@ export const UPSTREAM_CLI_CONTRACTS = {
                 env: { CODEX_MODEL: "gpt-5.5" },
                 expect: "fail",
             },
+            {
+                id: "mistral-trust",
+                description: "Phase 4 slice γ: --trust is accepted",
+                args: ["-p", "hello", "--agent", "auto-approve", "--trust"],
+                env: { VIBE_ACTIVE_MODEL: "mistral-medium-3.5" },
+                expect: "pass",
+            },
+            {
+                id: "mistral-max-turns-and-price",
+                description: "Phase 4 slice δ: --max-turns + --max-price are accepted together",
+                args: ["-p", "hello", "--agent", "auto-approve", "--max-turns", "3", "--max-price", "0.01"],
+                env: { VIBE_ACTIVE_MODEL: "mistral-medium-3.5" },
+                expect: "pass",
+            },
+            {
+                id: "mistral-max-price-scientific-notation",
+                description: "Phase 4 slice δ: scientific-notation --max-price is rejected by contract pattern (matches MAX_PRICE_SCHEMA bounds)",
+                args: ["-p", "hello", "--agent", "auto-approve", "--max-price", "1e-7"],
+                env: { VIBE_ACTIVE_MODEL: "mistral-medium-3.5" },
+                expect: "fail",
+            },
         ],
     },
 };
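
The `--max-turns` and `--max-price` contract entries above rely on regular-expression patterns applied to the string form of each value. The sketch below shows how such a pattern check might behave against the fixture arguments. `flagValueMatches` is a hypothetical helper, since the code that consumes `UPSTREAM_CLI_CONTRACTS` is not part of this diff; the two patterns are copied from the contract table.

```javascript
// Patterns copied from the grok/mistral contract entries.
const MAX_TURNS_PATTERN = /^[1-9][0-9]*$/;
const MAX_PRICE_PATTERN = /^(0|[1-9][0-9]*)(\.[0-9]+)?$/;

// Hypothetical checker: returns true when `flag` is present in `args`
// and the argument that follows it matches `pattern`.
function flagValueMatches(args, flag, pattern) {
    const i = args.indexOf(flag);
    if (i === -1 || i + 1 >= args.length) {
        return false;
    }
    return pattern.test(args[i + 1]);
}

// JavaScript switches to exponent notation below 1e-6, which is why the
// decimal-only price pattern lines up with a 1e-6 lower bound.
const smallestDecimal = String(0.000001); // "0.000001"
const belowFloor = String(1e-7);          // "1e-7"
```

With these definitions, the `grok-max-turns` fixture value `"5"` matches and `"0"` does not, while `"0.01"` matches the price pattern and `"1e-7"` is rejected, in line with the `expect` fields of the corresponding fixtures.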

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "llm-cli-gateway",
-  "version": "1.7.0",
+  "version": "1.9.0",
   "mcpName": "io.github.verivus-oss/llm-cli-gateway",
   "description": "MCP server providing unified access to Claude Code, Codex, Gemini, Grok, and Mistral Vibe CLIs with session management, retry logic, async job orchestration, durable job results, and cross-LLM validation.",
   "license": "MIT",