llm-cli-gateway 1.8.0 → 1.9.0
- package/CHANGELOG.md +98 -0
- package/dist/index.d.ts +62 -0
- package/dist/index.js +72 -21
- package/dist/request-helpers.d.ts +11 -0
- package/dist/request-helpers.js +6 -0
- package/dist/upstream-contracts.js +94 -9
- package/package.json +1 -1
package/CHANGELOG.md
CHANGED
|
@@ -2,6 +2,104 @@
|
|
|
2
2
|
|
|
3
3
|
All notable changes to the llm-cli-gateway project.
|
|
4
4
|
|
|
5
|
+
## [1.9.0] - 2026-05-27 — Phase 4 slice δ (budget/max-turns parity) + retroactive α/γ contract closure
|
|
6
|
+
|
|
7
|
+
Ships the fourth Phase 4 slice (budget/max-turns parity for Grok and Mistral),
|
|
8
|
+
and retroactively closes three latent contract gaps that shipped silently in
|
|
9
|
+
v1.8.0 (slices α and γ). Five commits land together: the slice δ feature,
|
|
10
|
+
two bounds-tightening fixes, a contract-table closure, and a test-veracity
|
|
11
|
+
hardening pass driven by an iterative multi-LLM audit.
|
|
12
|
+
|
|
13
|
+
### Added — `maxTurns` / `maxPrice` budget caps (slice δ)
|
|
14
|
+
|
|
15
|
+
- `grok_request` and `grok_request_async` gain optional `maxTurns?: number`
|
|
16
|
+
→ emits `grok --max-turns N`. Grok exposes no per-request budget flag,
|
|
17
|
+
so `--max-price` is Mistral-only.
|
|
18
|
+
- `mistral_request` and `mistral_request_async` gain optional
|
|
19
|
+
`maxTurns?: number` → `vibe --max-turns N` AND `maxPrice?: number` →
|
|
20
|
+
`vibe --max-price DOLLARS`. Both apply only in programmatic mode (`-p`),
|
|
21
|
+
matching Vibe's documented constraint.
|
|
22
|
+
- The Mistral stale-model recovery retry path (extracted into a pure
|
|
23
|
+
`buildMistralRetryPrep` helper) preserves all three slice-γ/δ flags
|
|
24
|
+
(`trust`, `maxTurns`, `maxPrice`) on the second attempt.
|
|
25
|
+
- Defaults: undefined for all three new fields → no flag emitted →
|
|
26
|
+
existing callers see no behavioural change.
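The "undefined means no flag" rule in this entry can be sketched as follows. This is a minimal illustration, not the package's own prepare function: `budgetArgs` is a hypothetical name, and only the flag spellings (`--max-turns`, `--max-price`) come from the changelog.

```javascript
// Hypothetical helper (not exported by the package): maps the optional
// budget fields to argv fragments, emitting nothing for undefined fields.
function budgetArgs({ maxTurns, maxPrice } = {}) {
  const args = [];
  if (maxTurns !== undefined) {
    args.push("--max-turns", String(maxTurns));
  }
  if (maxPrice !== undefined) {
    args.push("--max-price", String(maxPrice));
  }
  return args;
}

// Existing callers pass neither field, so no flag is added.
console.log(budgetArgs());
// A caller that opts in gets one flag/value pair per field it sets.
console.log(budgetArgs({ maxTurns: 5 }));
console.log(budgetArgs({ maxTurns: 5, maxPrice: 0.25 }));
```

Checking against `undefined` rather than truthiness is what keeps the absence of a field distinct from any numeric value the caller might pass.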
|
|
27
|
+
|
|
28
|
+
### Fixed — Bounded numeric schemas for lossless argv stringification
|
|
29
|
+
|
|
30
|
+
- Extracted two shared, exported Zod constants:
|
|
31
|
+
- `MAX_TURNS_SCHEMA = z.number().int().positive().safe().max(10_000)`
|
|
32
|
+
- `MAX_PRICE_SCHEMA = z.number().positive().finite().min(1e-6).max(10_000)`
|
|
33
|
+
- The `.min(1e-6)` lower bound on price is exactly the boundary where
|
|
34
|
+
`String(N)` switches from decimal to scientific notation
|
|
35
|
+
(`String(1e-6) === "0.000001"` but `String(1e-7) === "1e-7"`); both
|
|
36
|
+
upstream CLIs reject scientific-notation values.
|
|
37
|
+
- Reused across all four slice-δ tool registrations so bounds stay
|
|
38
|
+
consistent if they ever need to change.
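The notation boundaries this entry relies on can be checked in any JavaScript runtime. The sketch below is illustrative only; `stringifiesAsDecimal` is a hypothetical helper and is not part of the package.

```javascript
// Hypothetical helper: true when String(n) is a plain decimal literal
// with no exponent part, i.e. safe to hand to a CLI that rejects "1e-7".
function stringifiesAsDecimal(n) {
  return Number.isFinite(n) && /^-?\d+(\.\d+)?$/.test(String(n));
}

// Lower boundary quoted in the changelog entry.
console.log(String(1e-6)); // "0.000001"
console.log(String(1e-7)); // "1e-7"

// Upper side: the 10000 ceiling stays decimal; 1e21 does not.
console.log(String(10000)); // "10000"
console.log(String(1e21));  // "1e+21"

console.log(stringifiesAsDecimal(1e-6)); // true
console.log(stringifiesAsDecimal(1e-7)); // false
```

A check like this only describes how `String` formats a number; whether an upstream CLI accepts a given decimal string is a separate question that the contract fixtures cover.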
|
|
39
|
+
|
|
40
|
+
### Fixed — Upstream contract table closes 5 latent flag gaps
|
|
41
|
+
|
|
42
|
+
`assertUpstreamCliArgs` consults `UPSTREAM_CLI_CONTRACTS` on every real
|
|
43
|
+
`*_request` call. The following flags / mcpParameters were never registered
|
|
44
|
+
there before this release, so production calls setting any of them threw
|
|
45
|
+
"Upstream contract violation" at runtime even though the prepare-function
|
|
46
|
+
unit tests passed:
|
|
47
|
+
|
|
48
|
+
- **Gemini** (slice γ retroactive): `skipTrust` + `--skip-trust`.
|
|
49
|
+
- **Mistral** (slice γ + δ retroactive): `trust` + `--trust`; `maxTurns` +
|
|
50
|
+
`--max-turns`; `maxPrice` + `--max-price` (with a strict decimal-only
|
|
51
|
+
regex matching `MAX_PRICE_SCHEMA`'s lower bound).
|
|
52
|
+
- **Grok** (slice δ): `maxTurns` + `--max-turns`.
|
|
53
|
+
- **Codex** (slice α retroactive): `--output-schema` and `-c` removed
|
|
54
|
+
from `resumeForbiddenFlags` — verified accepted on `codex exec resume`
|
|
55
|
+
per codex-cli 0.133.0.
|
|
56
|
+
|
|
57
|
+
Conformance fixtures pin each new flag's argv shape, including a
|
|
58
|
+
`mistral-max-price-scientific-notation` fixture that locks the `1e-7`
|
|
59
|
+
rejection at the contract layer.
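To make the contract-layer check concrete, here is a minimal sketch of validating an argv array against a flag table. It is not the package's `assertUpstreamCliArgs` or `validateUpstreamCliArgs`: `CONTRACT` and `checkArgs` are hypothetical names, and the `--max-price` pattern is an assumption written to reject exponent notation. Only the `--max-turns` pattern is taken from the Grok contract in this diff.

```javascript
// Hypothetical flag table. The --max-turns pattern matches the one added
// to the Grok contract; the --max-price pattern is an assumed decimal-only regex.
const CONTRACT = {
  "-p": { arity: "one" },
  "--trust": { arity: "none" },
  "--max-turns": { arity: "one", pattern: /^[1-9][0-9]*$/ },
  "--max-price": { arity: "one", pattern: /^(?:0|[1-9][0-9]*)(?:\.[0-9]+)?$/ },
};

// Returns a list of violation messages; an empty list means the argv passes.
function checkArgs(args, contract = CONTRACT) {
  const violations = [];
  for (let i = 0; i < args.length; i++) {
    const flag = args[i];
    if (!Object.prototype.hasOwnProperty.call(contract, flag)) {
      violations.push(`unknown flag: ${flag}`);
      continue;
    }
    const spec = contract[flag];
    if (spec.arity === "one") {
      i += 1; // consume the flag's value
      const value = args[i];
      if (value === undefined) {
        violations.push(`missing value for ${flag}`);
      } else if (spec.pattern && !spec.pattern.test(value)) {
        violations.push(`bad value for ${flag}: ${value}`);
      }
    }
  }
  return violations;
}

console.log(checkArgs(["-p", "hello", "--max-turns", "5"]));          // passes
console.log(checkArgs(["-p", "hello", "--max-turns", "0"]));          // rejected by pattern
console.log(checkArgs(["-p", "hello", "--max-price", String(1e-7)])); // "1e-7" rejected
```

The point of the sketch is the failure mode the entry describes: a flag that is emitted by a prepare function but missing from the table is reported as unknown, regardless of whether its value is well formed.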
|
|
60
|
+
|
|
61
|
+
### Hardened — Test veracity (multi-LLM audit follow-up)
|
|
62
|
+
|
|
63
|
+
Codex + Grok ran iterative test-veracity audits with mutation probes per
|
|
64
|
+
`docs/plans/test-veracity-audit.spec.md`. They proved several added tests
|
|
65
|
+
were not falsifiable on the dimensions their commit messages claimed.
|
|
66
|
+
New file `src/__tests__/test-veracity-regressions.test.ts` closes those
|
|
67
|
+
gaps with six describe blocks:
|
|
68
|
+
|
|
69
|
+
- **REGRESSIONS A** — probes registered tool `inputSchema` bounds
|
|
70
|
+
directly (not the bare schema constants), so schema-drift in any of
|
|
71
|
+
the four sync/async registrations is caught.
|
|
72
|
+
- **REGRESSIONS B** — tests the pure `buildMistralRetryPrep` helper
|
|
73
|
+
across all combinations of `trust × maxTurns × maxPrice`. Self-
|
|
74
|
+
validated: dropping any of the three forwards on retry goes red.
|
|
75
|
+
- **REGRESSIONS C** — positive allowlist asserting slice α/γ/δ
|
|
76
|
+
parameters live in the matching contract's `mcpParameters` (closes
|
|
77
|
+
the self-oracle gap where removing a param from BOTH the contract
|
|
78
|
+
AND the schema previously stayed green).
|
|
79
|
+
- **REGRESSIONS D** — threads `prepare*Request` output into
|
|
80
|
+
`validateUpstreamCliArgs` end-to-end; the exact consistency check
|
|
81
|
+
the latent v1.8.0 contract breaks would have failed.
|
|
82
|
+
- **REGRESSIONS E** — `it.each` over sync AND async variants of every
|
|
83
|
+
slice-touched tool; the existing C4 was sync-only.
|
|
84
|
+
- **REGRESSIONS F** — flag-fixture coverage map: every flag in each
|
|
85
|
+
contract `flags` table must be exercised by a passing fixture (with
|
|
86
|
+
a grandfathered pre-audit baseline). Forces future slice authors to
|
|
87
|
+
add a fixture alongside any new flag entry.
|
|
88
|
+
|
|
89
|
+
The existing C4 (`MCP request schemas expose the provider contract
|
|
90
|
+
parameters`) now walks `_async` tools too.
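The combination matrix described for REGRESSIONS B can be sketched as below. This is an illustration of the idea, not the contents of `test-veracity-regressions.test.ts`: `buildRetryArgs`, `allCombinations`, and `findDroppedFlags` are hypothetical names, and `buildRetryArgs` is a stand-in for the real retry helper.

```javascript
// Hypothetical stand-in for a retry-prep builder: forwards the three flags into argv.
function buildRetryArgs({ trust, maxTurns, maxPrice }) {
  const args = [];
  if (trust) args.push("--trust");
  if (maxTurns !== undefined) args.push("--max-turns", String(maxTurns));
  if (maxPrice !== undefined) args.push("--max-price", String(maxPrice));
  return args;
}

// Cartesian product over the set/unset state of each flag: 2 x 2 x 2 = 8 cases.
function allCombinations() {
  const combos = [];
  for (const trust of [undefined, true]) {
    for (const maxTurns of [undefined, 7]) {
      for (const maxPrice of [undefined, 1.5]) {
        combos.push({ trust, maxTurns, maxPrice });
      }
    }
  }
  return combos;
}

// Reports every case where a set flag is missing from argv or an unset flag appears.
function findDroppedFlags(build) {
  const failures = [];
  for (const combo of allCombinations()) {
    const args = build(combo);
    const expectations = [
      ["--trust", combo.trust === true],
      ["--max-turns", combo.maxTurns !== undefined],
      ["--max-price", combo.maxPrice !== undefined],
    ];
    for (const [flag, expected] of expectations) {
      if (args.includes(flag) !== expected) failures.push({ combo, flag });
    }
  }
  return failures;
}

console.log(findDroppedFlags(buildRetryArgs).length); // 0

// Mutation probe: a builder that forgets maxPrice fails in the 4 cases that set it.
const mutant = ({ trust, maxTurns }) => buildRetryArgs({ trust, maxTurns });
console.log(findDroppedFlags(mutant).length); // 4
```

Running the same matrix against a deliberately broken builder is what makes the check falsifiable: if the mutant produced zero failures, the test would not be guarding the property it claims to guard.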
|
|
91
|
+
|
|
92
|
+
### Notes
|
|
93
|
+
|
|
94
|
+
Multi-LLM review across multiple iterative rounds, ending with a
|
|
95
|
+
dedicated test-veracity audit per Werner's strict-evidence protocol
|
|
96
|
+
(documented in `docs/plans/test-veracity-audit.spec.md`). Round 2 of the
|
|
97
|
+
audit landed UNCONDITIONAL APPROVE from Codex, Grok, Claude, and Mistral
|
|
98
|
+
with full mutation-probe evidence — every documented counterexample
|
|
99
|
+
mutation went red as predicted; tests are falsifiable by exactly the
|
|
100
|
+
regressions they claim to guard against. Gemini was quota-exhausted
|
|
101
|
+
during the audit window (~6h reset) and did not participate in round 2.
|
|
102
|
+
|
|
5
103
|
## [1.8.0] - 2026-05-27 — Phase 4 openers (codex resume fix, mistral telemetry, headless trust flags)
|
|
6
104
|
|
|
7
105
|
Ships the first three slices of the Phase 4 provider-modernisation
|
package/dist/index.d.ts
CHANGED
|
@@ -54,6 +54,19 @@ declare const logger: {
|
|
|
54
54
|
debug: (message: string, ...args: any[]) => void;
|
|
55
55
|
};
|
|
56
56
|
type GatewayLogger = typeof logger;
|
|
57
|
+
/**
|
|
58
|
+
* Phase 4 slice δ — shared Zod fragments for `maxTurns` / `maxPrice`.
|
|
59
|
+
*
|
|
60
|
+
* Both flags reach the upstream CLIs as decimal-formatted argv strings via
|
|
61
|
+
* `String(N)`. `z.number().int().positive()` alone lets values past
|
|
62
|
+
* `Number.MAX_SAFE_INTEGER` through, after which `String(1e21)` emits
|
|
63
|
+
* scientific notation that Grok and Vibe both reject. The bounds below
|
|
64
|
+
* (safe-integer cap + 10000 ceiling for turns; finite + 10000 USD ceiling
|
|
65
|
+
* for price) guarantee a lossless decimal stringification AND a sane
|
|
66
|
+
* upper bound — no plausible single agent loop exceeds 10k turns or 10k USD.
|
|
67
|
+
*/
|
|
68
|
+
export declare const MAX_TURNS_SCHEMA: z.ZodNumber;
|
|
69
|
+
export declare const MAX_PRICE_SCHEMA: z.ZodNumber;
|
|
57
70
|
export declare const SESSION_PROVIDER_VALUES: readonly ["claude", "codex", "gemini", "grok", "mistral"];
|
|
58
71
|
export declare const SESSION_PROVIDER_ENUM: z.ZodEnum<["claude", "codex", "gemini", "grok", "mistral"]>;
|
|
59
72
|
export type SessionProvider = (typeof SESSION_PROVIDER_VALUES)[number];
|
|
@@ -215,6 +228,29 @@ export declare function prepareGeminiRequest(params: {
|
|
|
215
228
|
*/
|
|
216
229
|
skipTrust?: boolean;
|
|
217
230
|
}, runtime?: GatewayServerRuntime): CliRequestPrep | ExtendedToolResponse;
|
|
231
|
+
export declare function prepareGrokRequest(params: {
|
|
232
|
+
prompt?: string;
|
|
233
|
+
promptParts?: PromptParts;
|
|
234
|
+
model?: string;
|
|
235
|
+
outputFormat?: string;
|
|
236
|
+
alwaysApprove?: boolean;
|
|
237
|
+
permissionMode?: string;
|
|
238
|
+
effort?: string;
|
|
239
|
+
reasoningEffort?: string;
|
|
240
|
+
allowedTools?: string[];
|
|
241
|
+
disallowedTools?: string[];
|
|
242
|
+
approvalStrategy: "legacy" | "mcp_managed";
|
|
243
|
+
approvalPolicy?: string;
|
|
244
|
+
mcpServers?: ClaudeMcpServerName[];
|
|
245
|
+
correlationId?: string;
|
|
246
|
+
optimizePrompt: boolean;
|
|
247
|
+
operation: string;
|
|
248
|
+
/**
|
|
249
|
+
* Phase 4 slice δ: emit `--max-turns N` so callers can cap agent-loop
|
|
250
|
+
* iterations for cost / latency control. Mirrors Claude's wiring.
|
|
251
|
+
*/
|
|
252
|
+
maxTurns?: number;
|
|
253
|
+
}, runtime?: GatewayServerRuntime): CliRequestPrep | ExtendedToolResponse;
|
|
218
254
|
export declare function prepareMistralRequest(params: {
|
|
219
255
|
prompt?: string;
|
|
220
256
|
promptParts?: PromptParts;
|
|
@@ -236,9 +272,29 @@ export declare function prepareMistralRequest(params: {
|
|
|
236
272
|
* prompt for this invocation only (not persisted). Default undefined.
|
|
237
273
|
*/
|
|
238
274
|
trust?: boolean;
|
|
275
|
+
/** Phase 4 slice δ: Vibe `--max-turns N` cap on agent-loop iterations. */
|
|
276
|
+
maxTurns?: number;
|
|
277
|
+
/** Phase 4 slice δ: Vibe `--max-price DOLLARS` cumulative-cost cap. */
|
|
278
|
+
maxPrice?: number;
|
|
239
279
|
}, runtime?: GatewayServerRuntime): (CliRequestPrep & {
|
|
240
280
|
mistralEnv: Record<string, string>;
|
|
241
281
|
}) | ExtendedToolResponse;
|
|
282
|
+
/**
|
|
283
|
+
* Phase 4 slice δ post-review: pure helper extracted from
|
|
284
|
+
* `handleMistralRequest` so the retry-path arg-preservation invariants
|
|
285
|
+
* (trust + maxTurns + maxPrice from slices γ/δ) are unit-testable
|
|
286
|
+
* without mocking awaitJobOrDefer. Any param the wrapper threads into
|
|
287
|
+
* the FIRST `buildMistralCliInvocation` call MUST also be threaded
|
|
288
|
+
* through here, or a fresh-workspace / budgeted run can degrade on
|
|
289
|
+
* the second attempt.
|
|
290
|
+
*/
|
|
291
|
+
export declare function buildMistralRetryPrep(params: Pick<MistralRequestParams, "outputFormat" | "permissionMode" | "effort" | "reasoningEffort" | "allowedTools" | "disallowedTools" | "approvalStrategy" | "trust" | "maxTurns" | "maxPrice"> & {
|
|
292
|
+
effectivePrompt: string;
|
|
293
|
+
}, recoveryModel: string): {
|
|
294
|
+
args: string[];
|
|
295
|
+
env: Record<string, string>;
|
|
296
|
+
ignoredDisallowedTools: boolean;
|
|
297
|
+
};
|
|
242
298
|
export interface GeminiRequestParams {
|
|
243
299
|
prompt?: string;
|
|
244
300
|
promptParts?: PromptParts;
|
|
@@ -303,6 +359,8 @@ export interface GrokRequestParams {
|
|
|
303
359
|
optimizeResponse?: boolean;
|
|
304
360
|
idleTimeoutMs?: number;
|
|
305
361
|
forceRefresh?: boolean;
|
|
362
|
+
/** Phase 4 slice δ: cap agent-loop iterations via `--max-turns N`. */
|
|
363
|
+
maxTurns?: number;
|
|
306
364
|
}
|
|
307
365
|
export declare function handleGrokRequest(deps: HandlerDeps, params: GrokRequestParams): Promise<ExtendedToolResponse>;
|
|
308
366
|
export declare function handleGrokRequestAsync(deps: AsyncHandlerDeps, params: Omit<GrokRequestParams, "optimizeResponse">): Promise<ExtendedToolResponse>;
|
|
@@ -329,6 +387,10 @@ export interface MistralRequestParams {
|
|
|
329
387
|
forceRefresh?: boolean;
|
|
330
388
|
/** Phase 4 slice γ: emit `--trust` for fresh-workspace headless runs. */
|
|
331
389
|
trust?: boolean;
|
|
390
|
+
/** Phase 4 slice δ: Vibe `--max-turns N` cap on agent-loop iterations. */
|
|
391
|
+
maxTurns?: number;
|
|
392
|
+
/** Phase 4 slice δ: Vibe `--max-price DOLLARS` cumulative-cost cap. */
|
|
393
|
+
maxPrice?: number;
|
|
332
394
|
}
|
|
333
395
|
export declare function handleMistralRequest(deps: HandlerDeps, params: MistralRequestParams): Promise<ExtendedToolResponse>;
|
|
334
396
|
export declare function handleMistralRequestAsync(deps: AsyncHandlerDeps, params: Omit<MistralRequestParams, "optimizeResponse">): Promise<ExtendedToolResponse>;
|
package/dist/index.js
CHANGED
|
@@ -229,6 +229,23 @@ function getApprovalManager(runtimeLogger = logger) {
|
|
|
229
229
|
return approvalManager;
|
|
230
230
|
}
|
|
231
231
|
const MCP_SERVER_ENUM = z.enum(CLAUDE_MCP_SERVER_NAMES);
|
|
232
|
+
/**
|
|
233
|
+
* Phase 4 slice δ — shared Zod fragments for `maxTurns` / `maxPrice`.
|
|
234
|
+
*
|
|
235
|
+
* Both flags reach the upstream CLIs as decimal-formatted argv strings via
|
|
236
|
+
* `String(N)`. `z.number().int().positive()` alone lets values past
|
|
237
|
+
* `Number.MAX_SAFE_INTEGER` through, after which `String(1e21)` emits
|
|
238
|
+
* scientific notation that Grok and Vibe both reject. The bounds below
|
|
239
|
+
* (safe-integer cap + 10000 ceiling for turns; finite + 10000 USD ceiling
|
|
240
|
+
* for price) guarantee a lossless decimal stringification AND a sane
|
|
241
|
+
* upper bound — no plausible single agent loop exceeds 10k turns or 10k USD.
|
|
242
|
+
*/
|
|
243
|
+
export const MAX_TURNS_SCHEMA = z.number().int().positive().safe().max(10_000);
|
|
244
|
+
// `.min(1e-6)` keeps the value in JS's decimal-stringify range:
|
|
245
|
+
// String(1e-6) === "0.000001" but String(1e-7) === "1e-7", which both
|
|
246
|
+
// upstream CLIs would reject. 1µUSD per request is fine-grained enough
|
|
247
|
+
// for any plausible budget-cap use.
|
|
248
|
+
export const MAX_PRICE_SCHEMA = z.number().positive().finite().min(1e-6).max(10_000);
|
|
232
249
|
// U22: Session-provider enum extended to five providers. The storage layer's
|
|
233
250
|
// CLI_TYPES already includes "mistral"; the MCP-tool layer mirrors that here so
|
|
234
251
|
// session_create / session_list / session_clear_all accept the fifth provider.
|
|
@@ -1273,7 +1290,7 @@ export function prepareGeminiRequest(params, runtime = resolveGatewayServerRunti
|
|
|
1273
1290
|
stablePrefixTokens,
|
|
1274
1291
|
};
|
|
1275
1292
|
}
|
|
1276
|
-
function prepareGrokRequest(params, runtime = resolveGatewayServerRuntime()) {
|
|
1293
|
+
export function prepareGrokRequest(params, runtime = resolveGatewayServerRuntime()) {
|
|
1277
1294
|
const corrId = params.correlationId || randomUUID();
|
|
1278
1295
|
const cliInfo = getCliInfo();
|
|
1279
1296
|
const resolvedModel = resolveModelAlias("grok", params.model, cliInfo);
|
|
@@ -1349,6 +1366,9 @@ function prepareGrokRequest(params, runtime = resolveGatewayServerRuntime()) {
|
|
|
1349
1366
|
if (params.disallowedTools && params.disallowedTools.length > 0) {
|
|
1350
1367
|
args.push("--disallowed-tools", params.disallowedTools.join(","));
|
|
1351
1368
|
}
|
|
1369
|
+
if (params.maxTurns !== undefined) {
|
|
1370
|
+
args.push("--max-turns", String(params.maxTurns));
|
|
1371
|
+
}
|
|
1352
1372
|
return {
|
|
1353
1373
|
corrId,
|
|
1354
1374
|
effectivePrompt,
|
|
@@ -1433,6 +1453,8 @@ export function prepareMistralRequest(params, runtime = resolveGatewayServerRunt
|
|
|
1433
1453
|
allowedTools: params.allowedTools,
|
|
1434
1454
|
disallowedTools: params.disallowedTools,
|
|
1435
1455
|
trust: params.trust,
|
|
1456
|
+
maxTurns: params.maxTurns,
|
|
1457
|
+
maxPrice: params.maxPrice,
|
|
1436
1458
|
});
|
|
1437
1459
|
if (prep.ignoredDisallowedTools) {
|
|
1438
1460
|
runtime.logger.info(`[${corrId}] Mistral does not support disallowedTools; ignoring (caller passed ${params.disallowedTools?.length ?? 0} entries)`);
|
|
@@ -1463,6 +1485,32 @@ function selectMistralRecoveryModel(failedModel) {
|
|
|
1463
1485
|
].filter((model) => Boolean(model && model !== failedModel));
|
|
1464
1486
|
return candidates.find(model => model !== "local");
|
|
1465
1487
|
}
|
|
1488
|
+
/**
|
|
1489
|
+
* Phase 4 slice δ post-review: pure helper extracted from
|
|
1490
|
+
* `handleMistralRequest` so the retry-path arg-preservation invariants
|
|
1491
|
+
* (trust + maxTurns + maxPrice from slices γ/δ) are unit-testable
|
|
1492
|
+
* without mocking awaitJobOrDefer. Any param the wrapper threads into
|
|
1493
|
+
* the FIRST `buildMistralCliInvocation` call MUST also be threaded
|
|
1494
|
+
* through here, or a fresh-workspace / budgeted run can degrade on
|
|
1495
|
+
* the second attempt.
|
|
1496
|
+
*/
|
|
1497
|
+
export function buildMistralRetryPrep(params, recoveryModel) {
|
|
1498
|
+
return buildMistralCliInvocation({
|
|
1499
|
+
prompt: params.effectivePrompt,
|
|
1500
|
+
resolvedModel: recoveryModel,
|
|
1501
|
+
outputFormat: params.outputFormat,
|
|
1502
|
+
permissionMode: params.approvalStrategy === "mcp_managed"
|
|
1503
|
+
? "auto-approve"
|
|
1504
|
+
: (params.permissionMode ?? "auto-approve"),
|
|
1505
|
+
effort: params.effort,
|
|
1506
|
+
reasoningEffort: params.reasoningEffort,
|
|
1507
|
+
allowedTools: params.allowedTools,
|
|
1508
|
+
disallowedTools: params.disallowedTools,
|
|
1509
|
+
trust: params.trust,
|
|
1510
|
+
maxTurns: params.maxTurns,
|
|
1511
|
+
maxPrice: params.maxPrice,
|
|
1512
|
+
});
|
|
1513
|
+
}
|
|
1466
1514
|
function buildCliResponse(cli, stdout, optimizeResponse, corrId, sessionId, prep, durationMs, resumable, outputFormat, warnings) {
|
|
1467
1515
|
let finalStdout = stdout;
|
|
1468
1516
|
// Skip response optimization for JSON output to prevent corrupting structured data
|
|
@@ -1801,6 +1849,7 @@ export async function handleGrokRequest(deps, params) {
|
|
|
1801
1849
|
correlationId: params.correlationId,
|
|
1802
1850
|
optimizePrompt: params.optimizePrompt,
|
|
1803
1851
|
operation: "grok_request",
|
|
1852
|
+
maxTurns: params.maxTurns,
|
|
1804
1853
|
}, runtime);
|
|
1805
1854
|
if (!("args" in prep))
|
|
1806
1855
|
return prep;
|
|
@@ -1921,6 +1970,7 @@ export async function handleGrokRequestAsync(deps, params) {
|
|
|
1921
1970
|
correlationId: params.correlationId,
|
|
1922
1971
|
optimizePrompt: params.optimizePrompt,
|
|
1923
1972
|
operation: "grok_request_async",
|
|
1973
|
+
maxTurns: params.maxTurns,
|
|
1924
1974
|
}, runtime);
|
|
1925
1975
|
if (!("args" in prep))
|
|
1926
1976
|
return prep;
|
|
@@ -2003,6 +2053,8 @@ export async function handleMistralRequest(deps, params) {
|
|
|
2003
2053
|
optimizePrompt: params.optimizePrompt,
|
|
2004
2054
|
operation: "mistral_request",
|
|
2005
2055
|
trust: params.trust,
|
|
2056
|
+
maxTurns: params.maxTurns,
|
|
2057
|
+
maxPrice: params.maxPrice,
|
|
2006
2058
|
}, runtime);
|
|
2007
2059
|
if (!("args" in prep))
|
|
2008
2060
|
return prep;
|
|
@@ -2035,22 +2087,7 @@ export async function handleMistralRequest(deps, params) {
|
|
|
2035
2087
|
const recoveryModel = selectMistralRecoveryModel(prep.resolvedModel);
|
|
2036
2088
|
if (recoveryModel) {
|
|
2037
2089
|
deps.logger.info(`[${corrId}] mistral_request detected stale Vibe model selection; retrying once with ${recoveryModel}`);
|
|
2038
|
-
const retryPrep =
|
|
2039
|
-
prompt: prep.effectivePrompt,
|
|
2040
|
-
resolvedModel: recoveryModel,
|
|
2041
|
-
outputFormat: params.outputFormat,
|
|
2042
|
-
permissionMode: params.approvalStrategy === "mcp_managed"
|
|
2043
|
-
? "auto-approve"
|
|
2044
|
-
: (params.permissionMode ?? "auto-approve"),
|
|
2045
|
-
effort: params.effort,
|
|
2046
|
-
reasoningEffort: params.reasoningEffort,
|
|
2047
|
-
allowedTools: params.allowedTools,
|
|
2048
|
-
disallowedTools: params.disallowedTools,
|
|
2049
|
-
// Phase 4 slice γ: preserve --trust on the model-selection retry
|
|
2050
|
-
// so a fresh untrusted workspace doesn't block headlessly on the
|
|
2051
|
-
// second attempt after surviving the first.
|
|
2052
|
-
trust: params.trust,
|
|
2053
|
-
});
|
|
2090
|
+
const retryPrep = buildMistralRetryPrep({ ...params, effectivePrompt: prep.effectivePrompt }, recoveryModel);
|
|
2054
2091
|
const retryArgs = [...retryPrep.args, ...sessionResult.resumeArgs];
|
|
2055
2092
|
// Reuse the FR handoff built above — the retry preserves corrId,
|
|
2056
2093
|
// so the manager's logComplete still updates the original row.
|
|
@@ -2151,6 +2188,8 @@ export async function handleMistralRequestAsync(deps, params) {
|
|
|
2151
2188
|
optimizePrompt: params.optimizePrompt,
|
|
2152
2189
|
operation: "mistral_request_async",
|
|
2153
2190
|
trust: params.trust,
|
|
2191
|
+
maxTurns: params.maxTurns,
|
|
2192
|
+
maxPrice: params.maxPrice,
|
|
2154
2193
|
}, runtime);
|
|
2155
2194
|
if (!("args" in prep))
|
|
2156
2195
|
return prep;
|
|
@@ -3142,7 +3181,8 @@ export function createGatewayServer(deps = {}) {
|
|
|
3142
3181
|
.boolean()
|
|
3143
3182
|
.default(false)
|
|
3144
3183
|
.describe("Bypass dedup and force a fresh CLI run even if a recent identical request exists"),
|
|
3145
|
-
|
|
3184
|
+
maxTurns: MAX_TURNS_SCHEMA.optional().describe("Grok `--max-turns N`: cap on agent-loop iterations for cost / latency control (Phase 4 slice δ). Bounded to safe integers ≤ 10000."),
|
|
3185
|
+
}, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, alwaysApprove, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, maxTurns, }) => {
|
|
3146
3186
|
return handleGrokRequest({ sessionManager, logger, runtime }, {
|
|
3147
3187
|
prompt,
|
|
3148
3188
|
promptParts,
|
|
@@ -3165,6 +3205,7 @@ export function createGatewayServer(deps = {}) {
|
|
|
3165
3205
|
optimizeResponse,
|
|
3166
3206
|
idleTimeoutMs,
|
|
3167
3207
|
forceRefresh,
|
|
3208
|
+
maxTurns,
|
|
3168
3209
|
});
|
|
3169
3210
|
});
|
|
3170
3211
|
//──────────────────────────────────────────────────────────────────────────────
|
|
@@ -3242,7 +3283,9 @@ export function createGatewayServer(deps = {}) {
|
|
|
3242
3283
|
.boolean()
|
|
3243
3284
|
.default(false)
|
|
3244
3285
|
.describe("Emit `--trust` so Vibe trusts the cwd for this invocation only (not persisted to trusted_folders.toml) and skips the interactive trust prompt (Phase 4 slice γ)."),
|
|
3245
|
-
|
|
3286
|
+
maxTurns: MAX_TURNS_SCHEMA.optional().describe("Vibe `--max-turns N`: cap the agent-loop iteration count (programmatic mode only, Phase 4 slice δ). Bounded to safe integers ≤ 10000."),
|
|
3287
|
+
maxPrice: MAX_PRICE_SCHEMA.optional().describe("Vibe `--max-price DOLLARS`: interrupt the session when cumulative cost crosses this cap (programmatic mode only, Phase 4 slice δ). Bounded to finite values ≤ 10000 USD."),
|
|
3288
|
+
}, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, optimizeResponse, idleTimeoutMs, forceRefresh, trust, maxTurns, maxPrice, }) => {
|
|
3246
3289
|
return handleMistralRequest({ sessionManager, logger, runtime }, {
|
|
3247
3290
|
prompt,
|
|
3248
3291
|
promptParts,
|
|
@@ -3265,6 +3308,8 @@ export function createGatewayServer(deps = {}) {
|
|
|
3265
3308
|
idleTimeoutMs,
|
|
3266
3309
|
forceRefresh,
|
|
3267
3310
|
trust,
|
|
3311
|
+
maxTurns,
|
|
3312
|
+
maxPrice,
|
|
3268
3313
|
});
|
|
3269
3314
|
});
|
|
3270
3315
|
//──────────────────────────────────────────────────────────────────────────────
|
|
@@ -3753,7 +3798,8 @@ export function createGatewayServer(deps = {}) {
|
|
|
3753
3798
|
.boolean()
|
|
3754
3799
|
.default(false)
|
|
3755
3800
|
.describe("Bypass dedup and force a fresh CLI run even if a recent identical request exists"),
|
|
3756
|
-
|
|
3801
|
+
maxTurns: MAX_TURNS_SCHEMA.optional().describe("Grok `--max-turns N`: cap on agent-loop iterations for cost / latency control (Phase 4 slice δ). Bounded to safe integers ≤ 10000."),
|
|
3802
|
+
}, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, alwaysApprove, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, maxTurns, }) => {
|
|
3757
3803
|
return handleGrokRequestAsync({ sessionManager, asyncJobManager, logger, runtime }, {
|
|
3758
3804
|
prompt,
|
|
3759
3805
|
promptParts,
|
|
@@ -3775,6 +3821,7 @@ export function createGatewayServer(deps = {}) {
|
|
|
3775
3821
|
optimizePrompt,
|
|
3776
3822
|
idleTimeoutMs,
|
|
3777
3823
|
forceRefresh,
|
|
3824
|
+
maxTurns,
|
|
3778
3825
|
});
|
|
3779
3826
|
});
|
|
3780
3827
|
server.tool("mistral_request_async", {
|
|
@@ -3848,7 +3895,9 @@ export function createGatewayServer(deps = {}) {
|
|
|
3848
3895
|
.boolean()
|
|
3849
3896
|
.default(false)
|
|
3850
3897
|
.describe("Emit `--trust` so Vibe trusts the cwd for this invocation only (not persisted to trusted_folders.toml) and skips the interactive trust prompt (Phase 4 slice γ)."),
|
|
3851
|
-
|
|
3898
|
+
maxTurns: MAX_TURNS_SCHEMA.optional().describe("Vibe `--max-turns N`: cap the agent-loop iteration count (programmatic mode only, Phase 4 slice δ). Bounded to safe integers ≤ 10000."),
|
|
3899
|
+
maxPrice: MAX_PRICE_SCHEMA.optional().describe("Vibe `--max-price DOLLARS`: interrupt the session when cumulative cost crosses this cap (programmatic mode only, Phase 4 slice δ). Bounded to finite values ≤ 10000 USD."),
|
|
3900
|
+
}, async ({ prompt, promptParts, model, outputFormat, sessionId, resumeLatest, createNewSession, permissionMode, effort, reasoningEffort, approvalStrategy, approvalPolicy, mcpServers, allowedTools, disallowedTools, correlationId, optimizePrompt, idleTimeoutMs, forceRefresh, trust, maxTurns, maxPrice, }) => {
|
|
3852
3901
|
return handleMistralRequestAsync({ sessionManager, asyncJobManager, logger, runtime }, {
|
|
3853
3902
|
prompt,
|
|
3854
3903
|
promptParts,
|
|
@@ -3870,6 +3919,8 @@ export function createGatewayServer(deps = {}) {
|
|
|
3870
3919
|
idleTimeoutMs,
|
|
3871
3920
|
forceRefresh,
|
|
3872
3921
|
trust,
|
|
3922
|
+
maxTurns,
|
|
3923
|
+
maxPrice,
|
|
3873
3924
|
});
|
|
3874
3925
|
});
|
|
3875
3926
|
server.tool("llm_job_status", {
|
package/dist/request-helpers.d.ts
CHANGED
|
@@ -114,6 +114,17 @@ export interface PrepareMistralRequestInput {
|
|
|
114
114
|
* Vibe's prompt behaviour is preserved for existing callers.
|
|
115
115
|
*/
|
|
116
116
|
trust?: boolean;
|
|
117
|
+
/**
|
|
118
|
+
* Phase 4 slice δ: emit `--max-turns N` to cap the agent-loop iteration
|
|
119
|
+
* count (only applies in programmatic mode with `-p`).
|
|
120
|
+
*/
|
|
121
|
+
maxTurns?: number;
|
|
122
|
+
/**
|
|
123
|
+
* Phase 4 slice δ: emit `--max-price DOLLARS` so the session is
|
|
124
|
+
* interrupted when cumulative cost crosses the cap (programmatic mode
|
|
125
|
+
* only).
|
|
126
|
+
*/
|
|
127
|
+
maxPrice?: number;
|
|
117
128
|
}
|
|
118
129
|
export interface PrepareMistralRequestResult {
|
|
119
130
|
args: string[];
|
package/dist/request-helpers.js
CHANGED
|
@@ -179,6 +179,12 @@ export function prepareMistralRequest(input) {
|
|
|
179
179
|
if (input.trust) {
|
|
180
180
|
args.push("--trust");
|
|
181
181
|
}
|
|
182
|
+
if (input.maxTurns !== undefined) {
|
|
183
|
+
args.push("--max-turns", String(input.maxTurns));
|
|
184
|
+
}
|
|
185
|
+
if (input.maxPrice !== undefined) {
|
|
186
|
+
args.push("--max-price", String(input.maxPrice));
|
|
187
|
+
}
|
|
182
188
|
const ignoredDisallowedTools = Boolean(input.disallowedTools && input.disallowedTools.length > 0);
|
|
183
189
|
return { args, env, ignoredDisallowedTools };
|
|
184
190
|
}
|
package/dist/upstream-contracts.js
CHANGED
|
@@ -133,14 +133,11 @@ export const UPSTREAM_CLI_CONTRACTS = {
|
|
|
133
133
|
"ignoreRules",
|
|
134
134
|
],
|
|
135
135
|
resumeOnlyFlags: ["--last"],
|
|
136
|
-
|
|
137
|
-
|
|
138
|
-
|
|
139
|
-
|
|
140
|
-
|
|
141
|
-
"--search",
|
|
142
|
-
"-c",
|
|
143
|
-
],
|
|
136
|
+
// Phase 4 slice α (v1.8.0) verified that `codex exec resume` accepts
|
|
137
|
+
// `--output-schema` and `-c` (codex-cli 0.133.0 `exec resume --help`),
|
|
138
|
+
// so they're no longer forbidden. `--search` stays forbidden (resume
|
|
139
|
+
// inherits the original session's web-search state).
|
|
140
|
+
resumeForbiddenFlags: ["--sandbox", "--ask-for-approval", "--full-auto", "--search"],
|
|
144
141
|
flags: {
|
|
145
142
|
"--last": { arity: "none", description: "Resume latest session" },
|
|
146
143
|
"--model": { arity: "one", description: "Model selector" },
|
|
@@ -189,9 +186,24 @@ export const UPSTREAM_CLI_CONTRACTS = {
|
|
|
189
186
|
expect: "fail",
|
|
190
187
|
},
|
|
191
188
|
{
|
|
189
|
+
// Phase 4 slice α: --output-schema IS accepted on resume per
|
|
190
|
+
// codex-cli 0.133.0; this fixture pins the new behaviour so future
|
|
191
|
+
// contract changes can't silently regress.
|
|
192
192
|
id: "codex-resume-output-schema",
|
|
193
|
-
description: "
|
|
193
|
+
description: "Phase 4 slice α: --output-schema accepted on resume (codex-cli 0.133.0)",
|
|
194
194
|
args: ["exec", "resume", "--output-schema", "/tmp/schema.json", "session-id", "hello"],
|
|
195
|
+
expect: "pass",
|
|
196
|
+
},
|
|
197
|
+
{
|
|
198
|
+
id: "codex-resume-config-override",
|
|
199
|
+
description: "Phase 4 slice α: -c key=value accepted on resume",
|
|
200
|
+
args: ["exec", "resume", "-c", "model.foo=bar", "session-id", "hello"],
|
|
201
|
+
expect: "pass",
|
|
202
|
+
},
|
|
203
|
+
{
|
|
204
|
+
id: "codex-resume-search-still-forbidden",
|
|
205
|
+
description: "Phase 4 slice α: --search remains forbidden on resume",
|
|
206
|
+
args: ["exec", "resume", "--search", "session-id", "hello"],
|
|
195
207
|
expect: "fail",
|
|
196
208
|
},
|
|
197
209
|
],
|
|
@@ -219,6 +231,8 @@ export const UPSTREAM_CLI_CONTRACTS = {
|
|
|
219
231
|
"policyFiles",
|
|
220
232
|
"adminPolicyFiles",
|
|
221
233
|
"attachments",
|
|
234
|
+
// Phase 4 slice γ
|
|
235
|
+
"skipTrust",
|
|
222
236
|
],
|
|
223
237
|
flags: {
|
|
224
238
|
"-p": { arity: "one", description: "Prompt text" },
|
|
@@ -236,6 +250,10 @@ export const UPSTREAM_CLI_CONTRACTS = {
|
|
|
236
250
|
"--admin-policy": { arity: "one", description: "Admin policy file path" },
|
|
237
251
|
"-o": { arity: "one", values: ["json"], description: "Output format" },
|
|
238
252
|
"--resume": { arity: "one", description: "Resume session" },
|
|
253
|
+
"--skip-trust": {
|
|
254
|
+
arity: "none",
|
|
255
|
+
description: "Trust workspace for this session (Phase 4 slice γ)",
|
|
256
|
+
},
|
|
239
257
|
},
|
|
240
258
|
env: {},
|
|
241
259
|
conformanceFixtures: [
|
|
@@ -251,6 +269,12 @@ export const UPSTREAM_CLI_CONTRACTS = {
|
|
|
251
269
|
args: ["-p", "hello", "--not-a-gemini-flag"],
|
|
252
270
|
expect: "fail",
|
|
253
271
|
},
|
|
272
|
+
{
|
|
273
|
+
id: "gemini-skip-trust",
|
|
274
|
+
description: "Phase 4 slice γ: --skip-trust is accepted",
|
|
275
|
+
args: ["-p", "hello", "--skip-trust"],
|
|
276
|
+
expect: "pass",
|
|
277
|
+
},
|
|
254
278
|
],
|
|
255
279
|
},
|
|
256
280
|
grok: {
|
|
@@ -275,6 +299,8 @@ export const UPSTREAM_CLI_CONTRACTS = {
|
|
|
275
299
|
"mcpServers",
|
|
276
300
|
"allowedTools",
|
|
277
301
|
"disallowedTools",
|
|
302
|
+
// Phase 4 slice δ
|
|
303
|
+
"maxTurns",
|
|
278
304
|
],
|
|
279
305
|
flags: {
|
|
280
306
|
"-p": { arity: "one", description: "Prompt text" },
|
|
@@ -299,6 +325,11 @@ export const UPSTREAM_CLI_CONTRACTS = {
|
|
|
299
325
|
},
|
|
300
326
|
"--resume": { arity: "one", description: "Resume session" },
|
|
301
327
|
"--continue": { arity: "none", description: "Continue latest session" },
|
|
328
|
+
"--max-turns": {
|
|
329
|
+
arity: "one",
|
|
330
|
+
pattern: /^[1-9][0-9]*$/,
|
|
331
|
+
description: "Agent-loop iteration cap (Phase 4 slice δ)",
|
|
332
|
+
},
|
|
302
333
|
},
|
|
303
334
|
env: {},
|
|
304
335
|
conformanceFixtures: [
|
|
@@ -314,6 +345,18 @@ export const UPSTREAM_CLI_CONTRACTS = {
|
|
|
314
345
|
args: ["-p", "hello", "--not-a-grok-flag"],
|
|
315
346
|
expect: "fail",
|
|
316
347
|
},
|
|
348
|
+
{
|
|
349
|
+
id: "grok-max-turns",
|
|
350
|
+
description: "Phase 4 slice δ: --max-turns N is accepted",
|
|
351
|
+
args: ["-p", "hello", "--max-turns", "5"],
|
|
352
|
+
expect: "pass",
|
|
353
|
+
},
|
|
354
|
+
{
|
|
355
|
+
id: "grok-max-turns-invalid-zero",
|
|
356
|
+
description: "Phase 4 slice δ: --max-turns 0 is rejected by contract pattern",
|
|
357
|
+
args: ["-p", "hello", "--max-turns", "0"],
|
|
358
|
+
expect: "fail",
|
|
359
|
+
},
|
|
317
360
|
],
|
|
318
361
|
},
|
|
319
362
|
mistral: {
|
|
@@ -337,6 +380,11 @@ export const UPSTREAM_CLI_CONTRACTS = {
|
|
|
337
380
|
"mcpServers",
|
|
338
381
|
"allowedTools",
|
|
339
382
|
"disallowedTools",
|
|
383
|
+
// Phase 4 slice γ
|
|
384
|
+
"trust",
|
|
385
|
+
// Phase 4 slice δ
|
|
386
|
+
"maxTurns",
|
|
387
|
+
"maxPrice",
|
|
340
388
|
],
|
|
341
389
|
flags: {
|
|
342
390
|
"-p": { arity: "one", description: "Prompt text" },
|
|
@@ -355,6 +403,22 @@ export const UPSTREAM_CLI_CONTRACTS = {
|
|
|
355
403
|
"--enabled-tools": { arity: "one", description: "Enabled tool" },
|
|
356
404
|
"--resume": { arity: "one", description: "Resume session" },
|
|
357
405
|
"--continue": { arity: "none", description: "Continue latest session" },
|
|
406
|
+
"--trust": {
|
|
407
|
+
arity: "none",
|
|
408
|
+
description: "Trust cwd for this invocation only (Phase 4 slice γ)",
|
|
409
|
+
},
|
|
410
|
+
"--max-turns": {
|
|
411
|
+
arity: "one",
|
|
412
|
+
pattern: /^[1-9][0-9]*$/,
|
|
413
|
+
description: "Agent-loop iteration cap (Phase 4 slice δ, programmatic mode only)",
|
|
414
|
+
},
|
|
415
|
+
"--max-price": {
|
|
416
|
+
arity: "one",
|
|
417
|
+
// Decimal-only: matches the MAX_PRICE_SCHEMA min(1e-6) lower bound
|
|
418
|
+
// that keeps String(N) in decimal form (no scientific notation).
|
|
419
|
+
pattern: /^(0|[1-9][0-9]*)(\.[0-9]+)?$/,
|
|
420
|
+
description: "Cumulative cost cap in USD (Phase 4 slice δ, programmatic mode only)",
|
|
421
|
+
},
|
|
358
422
|
},
|
|
359
423
|
env: {
|
|
360
424
|
VIBE_ACTIVE_MODEL: {
|
|
@@ -378,6 +442,27 @@ export const UPSTREAM_CLI_CONTRACTS = {
|
|
|
378
442
|
env: { CODEX_MODEL: "gpt-5.5" },
|
|
379
443
|
expect: "fail",
|
|
380
444
|
},
|
|
445
|
+
{
|
|
446
|
+
id: "mistral-trust",
|
|
447
|
+
description: "Phase 4 slice γ: --trust is accepted",
|
|
448
|
+
args: ["-p", "hello", "--agent", "auto-approve", "--trust"],
|
|
449
|
+
env: { VIBE_ACTIVE_MODEL: "mistral-medium-3.5" },
|
|
450
|
+
expect: "pass",
|
|
451
|
+
},
|
|
452
|
+
{
|
|
453
|
+
id: "mistral-max-turns-and-price",
|
|
454
|
+
description: "Phase 4 slice δ: --max-turns + --max-price are accepted together",
|
|
455
|
+
args: ["-p", "hello", "--agent", "auto-approve", "--max-turns", "3", "--max-price", "0.01"],
|
|
456
|
+
env: { VIBE_ACTIVE_MODEL: "mistral-medium-3.5" },
|
|
457
|
+
expect: "pass",
|
|
458
|
+
},
|
|
459
|
+
{
|
|
460
|
+
id: "mistral-max-price-scientific-notation",
|
|
461
|
+
description: "Phase 4 slice δ: scientific-notation --max-price is rejected by contract pattern (matches MAX_PRICE_SCHEMA bounds)",
|
|
462
|
+
args: ["-p", "hello", "--agent", "auto-approve", "--max-price", "1e-7"],
|
|
463
|
+
env: { VIBE_ACTIVE_MODEL: "mistral-medium-3.5" },
|
|
464
|
+
expect: "fail",
|
|
465
|
+
},
|
|
381
466
|
],
|
|
382
467
|
},
|
|
383
468
|
};
|
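The new `--max-turns` and `--max-price` entries above pair an `arity` with a `pattern`, and the added fixtures state which argument lists should pass or fail. The sketch below illustrates how such entries could be checked. It is a minimal, hypothetical checker and not the gateway's actual validator (which this diff does not show): `checkArgs`, `priceArgs`, and the trimmed `FLAGS` table are names made up for illustration, and the `--agent` entry is assumed from its use in the fixtures. The two patterns are copied verbatim from the diff.

```javascript
// Trimmed copies of flag entries from the diff above. The real contract
// tables carry more flags; "--agent" is assumed here because the Mistral
// fixtures pass it with a single value.
const FLAGS = {
  "-p": { arity: "one" },
  "--agent": { arity: "one" },
  "--trust": { arity: "none" },
  "--max-turns": { arity: "one", pattern: /^[1-9][0-9]*$/ },
  "--max-price": { arity: "one", pattern: /^(0|[1-9][0-9]*)(\.[0-9]+)?$/ },
};

// Hypothetical checker, NOT the gateway's validator. It accepts an argv
// array only if every token is a known flag, each arity-"one" flag is
// followed by a value, and that value matches the flag's pattern (if any).
function checkArgs(flags, args) {
  for (let i = 0; i < args.length; i++) {
    const name = args[i];
    if (!Object.prototype.hasOwnProperty.call(flags, name)) return false;
    const spec = flags[name];
    if (spec.arity === "one") {
      const value = args[++i]; // consume the flag's single value
      if (value === undefined) return false;
      if (spec.pattern && !spec.pattern.test(value)) return false;
    }
  }
  return true;
}

// The contract comment describes numbers reaching argv via String(N).
const priceArgs = (n) =>
  ["-p", "hello", "--agent", "auto-approve", "--max-price", String(n)];

console.log(checkArgs(FLAGS, ["-p", "hello", "--max-turns", "5"])); // true
console.log(checkArgs(FLAGS, ["-p", "hello", "--max-turns", "0"])); // false
console.log(String(1e-6), checkArgs(FLAGS, priceArgs(1e-6))); // 0.000001 true
console.log(String(1e-7), checkArgs(FLAGS, priceArgs(1e-7))); // 1e-7 false
```

JavaScript switches to exponent notation below `1e-6`, which is why a `min(1e-6)` lower bound keeps `String(N)` inside what the decimal-only pattern accepts. The pattern alone still admits `0`, so under this reading a positive lower bound would have to come from the schema rather than from the contract pattern.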
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "llm-cli-gateway",
|
|
3
|
-
"version": "1.
|
|
3
|
+
"version": "1.9.0",
|
|
4
4
|
"mcpName": "io.github.verivus-oss/llm-cli-gateway",
|
|
5
5
|
"description": "MCP server providing unified access to Claude Code, Codex, Gemini, Grok, and Mistral Vibe CLIs with session management, retry logic, async job orchestration, durable job results, and cross-LLM validation.",
|
|
6
6
|
"license": "MIT",
|