llm-cli-gateway 1.15.0 → 1.15.2

package/CHANGELOG.md CHANGED
@@ -2,6 +2,56 @@
2
2
 
3
3
  All notable changes to the llm-cli-gateway project.
4
4
 
5
+ ## Unreleased
6
+
7
+ ## [1.15.2] - 2026-05-29 — security quality follow-up
8
+
9
+ Patch release for GitHub Security & quality follow-up findings and Scorecard
10
+ documentation.
11
+
12
+ ### Fixed
13
+
14
+ - Preserve the leading content when truncating async job stdout/stderr in
15
+ `llm_job_result`, matching bounded-result consumer expectations instead of
16
+ returning only the tail.
17
+ - Handle installer gateway log file close errors explicitly so failed flushes
18
+ from writable stdout/stderr log handles are surfaced to callers.
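As an illustration of the first fix above, here is a minimal sketch of head-preserving truncation next to the tail-only behaviour it replaces. The names `truncateKeepHead`, `truncateKeepTail`, `TruncatedStream`, and `maxChars` are hypothetical and are not taken from the gateway's source; the real `llm_job_result` implementation may differ.

```typescript
// Hypothetical result shape; not the gateway's actual response schema.
interface TruncatedStream {
  text: string;
  truncated: boolean;
  omittedChars: number;
}

// Keeps the beginning of a captured stream and drops the end once the
// limit is exceeded. Lengths are counted in UTF-16 code units.
function truncateKeepHead(output: string, maxChars: number): TruncatedStream {
  if (maxChars < 0) {
    throw new RangeError("maxChars must be non-negative");
  }
  if (output.length <= maxChars) {
    return { text: output, truncated: false, omittedChars: 0 };
  }
  return {
    text: output.slice(0, maxChars),
    truncated: true,
    omittedChars: output.length - maxChars,
  };
}

// For contrast: tail-only truncation, which discards the leading content.
function truncateKeepTail(output: string, maxChars: number): string {
  // slice(-0) would return the whole string, so zero is handled explicitly.
  return maxChars === 0 ? "" : output.slice(-maxChars);
}
```

A consumer that reads a bounded result from the top (for example, a header line followed by a report) sees the header with `truncateKeepHead`, whereas `truncateKeepTail` would return only the final characters.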
19
+
20
+ ### Changed
21
+
22
+ - Moved non-canonical root Markdown into `docs/guides/` and `docs/archive/`
23
+ so the repository root stays focused on public entry points.
24
+ - Renamed async-defer result guidance from the old retrieval field to `collectWith`,
25
+ avoiding Socket substring false positives in generated package code.
26
+ - Recorded OpenSSF Scorecard `FuzzingID` as a valid roadmap/process item:
27
+ adding `fast-check` style property tests for parser, argv, and worktree
28
+ surfaces would improve the Scorecard signal, but the absence of fuzzing does
29
+ not block this patch release.
30
+
31
+ ## [1.15.1] - 2026-05-29 — quality badges + Sigstore release signing
32
+
33
+ Release-infrastructure follow-up to v1.15.0.
34
+
35
+ ### Added
36
+
37
+ - README quality badges for CI, security, OpenSSF Scorecard, npm, license, and
38
+ Sigstore-signed release artifacts.
39
+ - Sigstore keyless signing for GitHub release installer artifacts, including
40
+ `.sigstore.json` bundles and pre-upload verification in the release workflow.
41
+ - End-user verification guidance for `SHA256SUMS.sigstore.json` before trusting
42
+ release checksums.
43
+ - Sanitized Windows Claude Desktop MCP config example using 1Password
44
+ environment injection placeholders.
45
+ - Security workflow attribution guard that rejects new Claude/Anthropic
46
+ author/co-author metadata in future commits.
47
+
48
+ ### Changed
49
+
50
+ - Manual release-installer rebuilds now fail fast unless launched from the
51
+ matching release tag ref, keeping Sigstore certificate identities stable.
52
+ - Windows installer snippets and generated release manifest commands now verify
53
+ the Sigstore checksum bundle before executing the downloaded bootstrapper.
54
+
5
55
  ## [1.15.0] - 2026-05-28 — Phase 4 slice λ (gateway-owned worktree lifecycle)
6
56
 
7
57
  Ships the tenth Phase 4 slice: a new top-level `worktree` field on every
@@ -1097,11 +1147,11 @@ Technical corrections from the multi-LLM voice + technical review:
1097
1147
 
1098
1148
  ### Fixed — `socket.yml` networkAccess false-positive documentation
1099
1149
 
1100
- - Documented that the `globalThis["fetch"]` flag on `dist/index.js` /
1101
- `dist/job-store.js` is a substring-match false positive. Neither file
1102
- contains any actual fetch call; the matches are English-prose
1103
- occurrences in an error message, the `fetchWith` JSON field name, and
1104
- a code comment. Verified by sub-agent investigation, no code change
1150
+ - Documented that Socket's network-access flag on `dist/index.js` /
1151
+ `dist/job-store.js` was a substring-match false positive. Neither file
1152
+ contained a production network call; the matches were English-prose
1153
+ retrieval wording in an error message, a structured result-tool field name,
1154
+ and a code comment. Verified by sub-agent investigation, no code change
1105
1155
  required, no attack-surface delta vs 1.5.35.
1106
1156
 
1107
1157
  ### Fixed — `lychee.toml` exclusions
package/README.md CHANGED
@@ -1,25 +1,44 @@
1
1
  # llm-cli-gateway
2
2
 
3
- > *"Without consultation, plans are frustrated, but with many counselors they succeed."*
3
+ [![CI](https://github.com/verivus-oss/llm-cli-gateway/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/verivus-oss/llm-cli-gateway/actions/workflows/ci.yml)
4
+ [![Security](https://github.com/verivus-oss/llm-cli-gateway/actions/workflows/security.yml/badge.svg?branch=main)](https://github.com/verivus-oss/llm-cli-gateway/actions/workflows/security.yml)
5
+ [![OpenSSF Scorecard](https://api.scorecard.dev/projects/github.com/verivus-oss/llm-cli-gateway/badge)](https://scorecard.dev/viewer/?uri=github.com/verivus-oss/llm-cli-gateway)
6
+ [![OpenSSF Best Practices](https://www.bestpractices.dev/projects/13025/badge)](https://www.bestpractices.dev/projects/13025)
7
+ [![npm](https://img.shields.io/npm/v/llm-cli-gateway.svg)](https://www.npmjs.com/package/llm-cli-gateway)
8
+ [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
9
+ [![Releases: Sigstore signed](https://img.shields.io/badge/releases-Sigstore%20signed-2e7d32.svg)](SECURITY.md#release-signing)
10
+
11
+ > _"Without consultation, plans are frustrated, but with many counselors they succeed."_
4
12
  > — Proverbs 15:22 (LSB)
5
13
 
6
- A Model Context Protocol (MCP) server providing unified access to Claude Code, Codex, Gemini, Grok, and Mistral (Vibe) CLIs with session management, retry logic, and async job orchestration.
14
+ A Model Context Protocol (MCP) gateway for running Claude Code, Codex, Gemini, Grok, and Mistral (Vibe) CLIs from one MCP endpoint, with durable async jobs, session continuity, cache-aware prompting, observability, and personal-appliance setup tooling.
7
15
 
8
- ## Personal MCP Appliance MVP
16
+ ## What It Provides Today
9
17
 
10
- `llm-cli-gateway` is being packaged as a single-user personal MCP appliance for cross-LLM validation. The intended workflow is: connect one MCP endpoint, ask any client for cross-LLM validation.
18
+ `llm-cli-gateway` is a single-user MCP gateway for cross-LLM validation and multi-agent coding workflows. It is more than a thin CLI wrapper:
19
+
20
+ - Runs five provider CLIs through consistent sync and async MCP tools.
21
+ - Persists long-running jobs, supports restart-safe result collection, deduplication, cancellation, and sync-to-async deferral.
22
+ - Tracks sessions, real CLI resume paths, structured response metadata, and cache telemetry.
23
+ - Supports cache-aware `promptParts`, including explicit Claude `cache_control` when opted in.
24
+ - Can run requests inside gateway-managed git worktrees for isolated multi-agent review and implementation loops.
25
+ - Ships personal-appliance setup surfaces: HTTP transport with bearer-token auth, `doctor --json`, setup UI artifacts, provider setup snippets, Docker fallback, and checked release bundles.
26
+
27
+ ## Personal MCP Appliance
28
+
29
+ The personal-appliance contract keeps that surface intentionally narrow: one trusted user runs the gateway on a machine or volume they own, connects one MCP endpoint, and asks any connected client for cross-LLM validation.
11
30
 
12
31
  The product contract is documented in [docs/personal-mcp/PRODUCT_CONTRACT.md](docs/personal-mcp/PRODUCT_CONTRACT.md). It defines the single-user scope, security posture, target support matrix, and provider-support verification gates. Public setup guides must not claim ChatGPT, Claude web, Claude Desktop, Codex, Gemini CLI, Gemini web, or Grok inbound support until the corresponding provider/client path has been verified.
13
32
 
14
33
  This project does not provide hosted multi-tenant credential custody. Provider credentials stay on the user's machine or user-owned deployment volume.
15
34
 
16
- MVP release readiness is tracked in [docs/personal-mcp/RELEASE_READINESS.md](docs/personal-mcp/RELEASE_READINESS.md). Dogfooding evidence (which target LLMs guided setup, what unsafe suggestions were captured, which findings are deferred to post-MVP work) is in [docs/personal-mcp/DOGFOODING_RESULTS.md](docs/personal-mcp/DOGFOODING_RESULTS.md).
35
+ Release-readiness history is tracked in [docs/personal-mcp/RELEASE_READINESS.md](docs/personal-mcp/RELEASE_READINESS.md). Dogfooding evidence (which target LLMs guided setup, what unsafe suggestions were captured, and which findings were deferred from the initial personal-appliance rollout) is in [docs/personal-mcp/DOGFOODING_RESULTS.md](docs/personal-mcp/DOGFOODING_RESULTS.md).
17
36
 
18
37
  Current personal-appliance artifacts include:
19
38
 
20
39
  - Streamable HTTP startup: `LLM_GATEWAY_AUTH_TOKEN=<token> npm run start:http`
21
40
  - Machine-readable diagnostics: `npm run doctor`
22
- - Go bootstrapper scaffold: `installer/` with `setup`, `doctor --json`, `start`, `stop`, `status`, `repair`, `upgrade`, `uninstall`, `print-client-config`, and verified bundle download commands.
41
+ - Go bootstrapper: `installer/` with `setup`, `doctor --json`, `start`, `stop`, `status`, `repair`, `upgrade`, `uninstall`, `print-client-config`, and verified bundle download commands.
23
42
  - Release packaging: the release workflow builds Linux binaries on the local self-hosted runner, builds Windows/macOS binaries on GitHub-hosted runners, then publishes checksummed platform bundles with the gateway, production dependencies, and a managed Node runtime; see [installer/packaging/README.md](installer/packaging/README.md).
24
43
  - Docker Compose fallback: [docker-compose.personal.yml](docker-compose.personal.yml) + [Dockerfile.personal](Dockerfile.personal) for users who already manage containers.
25
44
  - Local setup UI artifact: [setup/ui/index.html](setup/ui/index.html)
@@ -34,11 +53,25 @@ Windows PowerShell:
34
53
  $Version = '<version>'
35
54
  $Base = "https://github.com/verivus-oss/llm-cli-gateway/releases/download/v$Version"
36
55
  $InstallDir = Join-Path (Join-Path $env:LOCALAPPDATA 'Programs') 'llm-cli-gateway'
56
+ $ExeName = "llm-cli-gateway-$Version-windows-amd64.exe"
57
+ $BundleName = "llm-cli-gateway-bundle-$Version-windows-amd64.tar.gz"
37
58
  $Exe = Join-Path $InstallDir 'llm-cli-gateway.exe'
59
+ $Checksums = Join-Path $InstallDir 'SHA256SUMS'
60
+ $ChecksumBundle = Join-Path $InstallDir 'SHA256SUMS.sigstore.json'
38
61
  New-Item -ItemType Directory -Force $InstallDir | Out-Null
39
- Invoke-WebRequest -UseBasicParsing "$Base/llm-cli-gateway-$Version-windows-amd64.exe" -OutFile $Exe
40
- $env:RVWR_GATEWAY_BUNDLE_URL = "$Base/llm-cli-gateway-bundle-$Version-windows-amd64.tar.gz"
41
- $env:RVWR_GATEWAY_BUNDLE_SHA256 = '<bundle-sha256-from-SHA256SUMS>'
62
+ Invoke-WebRequest -UseBasicParsing "$Base/$ExeName" -OutFile $Exe
63
+ Invoke-WebRequest -UseBasicParsing "$Base/SHA256SUMS" -OutFile $Checksums
64
+ Invoke-WebRequest -UseBasicParsing "$Base/SHA256SUMS.sigstore.json" -OutFile $ChecksumBundle
65
+ cosign verify-blob $Checksums --bundle $ChecksumBundle --certificate-identity "https://github.com/verivus-oss/llm-cli-gateway/.github/workflows/release-installer.yml@refs/tags/v$Version" --certificate-oidc-issuer "https://token.actions.githubusercontent.com"
66
+ if ($LASTEXITCODE -ne 0) { throw "Sigstore verification failed for SHA256SUMS" }
67
+ function Get-ReleaseSha256($Name) {
68
+ $line = Select-String -Path $Checksums -Pattern "^[a-fA-F0-9]{64}\s+$([regex]::Escape($Name))$" | Select-Object -First 1
69
+ if (-not $line) { throw "No SHA256SUMS entry found for $Name" }
70
+ return (($line.Line -split "\s+")[0]).ToLowerInvariant()
71
+ }
72
+ if ((Get-FileHash $Exe -Algorithm SHA256).Hash.ToLowerInvariant() -ne (Get-ReleaseSha256 $ExeName)) { throw "Checksum mismatch for $ExeName" }
73
+ $env:RVWR_GATEWAY_BUNDLE_URL = "$Base/$BundleName"
74
+ $env:RVWR_GATEWAY_BUNDLE_SHA256 = Get-ReleaseSha256 $BundleName
42
75
  & $Exe setup
43
76
  & $Exe stop
44
77
  & $Exe install-bundle
@@ -53,6 +86,9 @@ PATH. Do not script against release-versioned exe names after install.
53
86
 
54
87
  ```bash
55
88
  # After downloading the binary that matches your OS/arch from a release:
89
+ cosign verify-blob SHA256SUMS --bundle SHA256SUMS.sigstore.json \
90
+ --certificate-identity "https://github.com/verivus-oss/llm-cli-gateway/.github/workflows/release-installer.yml@refs/tags/v<version>" \
91
+ --certificate-oidc-issuer "https://token.actions.githubusercontent.com"
56
92
  sha256sum --check SHA256SUMS # verify before run (or `shasum -a 256 --check` on macOS)
57
93
  chmod +x llm-cli-gateway-<ver>-<os>-<arch>
58
94
  ./llm-cli-gateway-<ver>-<os>-<arch> setup
@@ -79,13 +115,16 @@ docker compose -f docker-compose.personal.yml run --rm doctor
79
115
  ## Features
80
116
 
81
117
  ### Core Capabilities
118
+
82
119
  - **Multi-LLM Orchestration**: Unified interface for Claude Code, Codex, Gemini, Grok, and Mistral (Vibe) CLIs
83
120
  - **Session Management**: Track and resume conversations across all CLIs with persistent storage
121
+ - **Gateway-owned worktrees**: Run any sync or async provider request inside a managed git worktree, with per-session reuse and cleanup
84
122
  - **Token Optimization**: Automatic 44% reduction on prompts, 37% on responses (opt-in)
85
123
  - **Correlation ID Tracking**: Full request tracing across all LLM interactions
86
124
  - **Cross-Tool Collaboration**: LLMs can use each other via MCP (validated through dogfooding)
87
125
 
88
126
  ### Observability
127
+
89
128
  - **SQLite Flight Recorder**: Every request/response logged to `~/.llm-cli-gateway/logs.db` with correlation IDs, token usage, duration, retry counts, and circuit breaker state. Browse with [Datasette](https://datasette.io/): `datasette ~/.llm-cli-gateway/logs.db`
90
129
  - **Structured Metadata**: Tool responses include machine-readable `structuredContent` (model, cli, correlationId, sessionId, durationMs, token counts)
91
130
  - **Cache observability resources**: `cache_state://global`, `cache_state://session/{id}`, and `cache_state://prefix/{hash}` MCP resources return aggregate cache hit/miss/savings — tokens and hashes only, no prompt text. `session_get` includes a `cacheState` block when the session has prior requests.
@@ -109,17 +148,18 @@ Every `*_request` and `*_request_async` tool accepts an optional `promptParts` f
109
148
 
110
149
  Per-CLI capability matrix:
111
150
 
112
- | CLI | Prefix discipline (auto via `promptParts`) | Explicit `cache_control` emission |
113
- |---------|--------------------------------------------|------------------------------------|
114
- | claude | yes | not yet (Branch B; gated on `[cache_awareness].emit_anthropic_cache_control`) |
115
- | codex | yes | n/a (OpenAI implicit cache, no CLI lever) |
116
- | gemini | yes | n/a (implicit prefix cache server-side) |
117
- | grok | yes | n/a (no surfaced cache lever) |
118
- | mistral | yes | n/a (no surfaced cache lever) |
151
+ | CLI | Prefix discipline (auto via `promptParts`) | Explicit `cache_control` emission |
152
+ | ------- | ------------------------------------------ | ---------------------------------------------------------------------------- |
153
+ | claude | yes | yes, opt-in via `promptParts.cacheControl` and `outputFormat: "stream-json"` |
154
+ | codex | yes | n/a (OpenAI implicit cache, no CLI lever) |
155
+ | gemini | yes | n/a (implicit prefix cache server-side) |
156
+ | grok | yes | n/a (no surfaced cache lever) |
157
+ | mistral | yes | n/a (no surfaced cache lever) |
119
158
 
120
159
  Opt-in flags (all default off) live under `[cache_awareness]` in `~/.llm-cli-gateway/config.toml`. See `docs/personal-mcp/PROVIDER_CACHE_SURFACES.md` for the per-model minimum cacheable token thresholds and field-name divergences.
121
160
 
122
161
  ### Reliability & Performance
162
+
123
163
  - **Retry Logic**: Exponential backoff with circuit breaker for transient failures
124
164
  - **Atomic File Writes**: Process-specific temp files with fsync for data integrity
125
165
  - **Memory Limits**: 50MB cap on CLI output prevents DoS attacks
@@ -127,7 +167,8 @@ Opt-in flags (all default off) live under `[cache_awareness]` in `~/.llm-cli-gat
127
167
  - **Long-Running Jobs**: Non-time-bound async execution via `*_request_async` + polling tools
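The retry bullet in this list can be pictured with a small sketch of exponential backoff combined with a circuit breaker. Everything here is an assumption made for illustration: the names `backoffDelayMs` and `CircuitBreaker`, the base delay, the cap, the failure threshold, and the cooldown are not the gateway's actual values or API.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// Opens after a run of consecutive failures and lets a probe request
// through again once the cooldown has elapsed.
class CircuitBreaker {
  private consecutiveFailures = 0;
  private openedAtMs: number | null = null;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;

  constructor(failureThreshold = 3, cooldownMs = 10000) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
  }

  allowRequest(nowMs: number): boolean {
    if (this.openedAtMs === null) {
      return true; // circuit closed
    }
    return nowMs - this.openedAtMs >= this.cooldownMs; // probe after cooldown
  }

  recordSuccess(): void {
    this.consecutiveFailures = 0;
    this.openedAtMs = null;
  }

  recordFailure(nowMs: number): void {
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= this.failureThreshold) {
      this.openedAtMs = nowMs;
    }
  }
}
```

Passing the clock in as `nowMs` keeps the sketch deterministic; a caller would check `allowRequest` before each attempt and wait `backoffDelayMs(attempt)` between transient failures.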
128
168
 
129
169
  ### Security & Quality
130
- - **Comprehensive Testing**: 681 tests covering unit, integration, and regression scenarios with real CLI execution
170
+
171
+ - **Comprehensive Testing**: 900+ tests covering unit, integration, and regression scenarios with real CLI execution
131
172
  - **Input Validation**: Zod schemas prevent injection attacks
132
173
  - **No Secret Leakage**: Generic session descriptions only (file permissions 0o600)
133
174
  - **No ReDoS**: Bounded regex patterns prevent catastrophic backtracking
@@ -139,6 +180,7 @@ Opt-in flags (all default off) live under `[cache_awareness]` in `~/.llm-cli-gat
139
180
  Before using this gateway, you need to install the CLI tools you want to use:
140
181
 
141
182
  ### Claude Code CLI
183
+
142
184
  ```bash
143
185
  # Installation instructions for Claude Code
144
186
  # Visit: https://docs.anthropic.com/claude-code
@@ -146,18 +188,21 @@ npm install -g @anthropic-ai/claude-code
146
188
  ```
147
189
 
148
190
  ### Codex CLI
191
+
149
192
  ```bash
150
193
  npm install -g @openai/codex
151
194
  codex login
152
195
  ```
153
196
 
154
197
  ### Gemini CLI
198
+
155
199
  ```bash
156
200
  npm install -g @google/gemini-cli
157
201
  # Or: https://github.com/google-gemini/gemini-cli
158
202
  ```
159
203
 
160
204
  ### Grok CLI (xAI)
205
+
161
206
  ```bash
162
207
  npm install -g grok-build
163
208
  grok login # OAuth flow, or set GROK_CODE_XAI_API_KEY
@@ -165,6 +210,7 @@ grok login # OAuth flow, or set GROK_CODE_XAI_API_KEY
165
210
  ```
166
211
 
167
212
  ### Mistral Vibe CLI
213
+
168
214
  ```bash
169
215
  # Pick one — the gateway's cli_upgrade auto-detects which one you used.
170
216
  pip install vibe-cli
@@ -184,7 +230,7 @@ Vibe-specific notes:
184
230
  requested or Vibe config needs recovery, and retries once after a
185
231
  model-not-found failure with refreshed discovery.
186
232
  - **`permissionMode` accepts** `default | plan | accept-edits | auto-approve |
187
- chat | explore | lean` and emits `--agent <mode>`. The gateway's
233
+ chat | explore | lean` and emits `--agent <mode>`. The gateway's
188
234
  programmatic-mode default is `auto-approve`; pick a stricter mode
189
235
  explicitly if you need approval gates.
190
236
  - **`allowedTools` is allow-list only** — the gateway emits one
@@ -198,11 +244,13 @@ Vibe-specific notes:
198
244
  ## Installation
199
245
 
200
246
  ### As an MCP server (npm)
247
+
201
248
  ```bash
202
249
  npm install -g llm-cli-gateway
203
250
  ```
204
251
 
205
252
  Or use directly with `npx`:
253
+
206
254
  ```json
207
255
  {
208
256
  "mcpServers": {
@@ -215,6 +263,7 @@ Or use directly with `npx`:
215
263
  ```
216
264
 
217
265
  ### From source
266
+
218
267
  ```bash
219
268
  git clone https://github.com/verivus-oss/llm-cli-gateway.git
220
269
  cd llm-cli-gateway
@@ -239,7 +288,7 @@ For clients that already support local stdio MCP servers, add a configuration li
239
288
  }
240
289
  ```
241
290
 
242
- This generic stdio example is not provider-support verification for the Personal MCP Appliance MVP. Client-specific setup guides for ChatGPT, Claude web, Claude Desktop, Codex, Gemini CLI, Gemini web, and Grok remain gated by the provider-support matrix in [docs/personal-mcp/PRODUCT_CONTRACT.md](docs/personal-mcp/PRODUCT_CONTRACT.md).
291
+ This generic stdio example is not provider-support verification for the Personal MCP Appliance. Client-specific setup guides for ChatGPT, Claude web, Claude Desktop, Codex, Gemini CLI, Gemini web, and Grok remain gated by the provider-support matrix in [docs/personal-mcp/PRODUCT_CONTRACT.md](docs/personal-mcp/PRODUCT_CONTRACT.md).
243
292
 
244
293
  ### Available Tools
245
294
 
@@ -260,9 +309,11 @@ The validation report preserves per-provider disagreement. Optional judge synthe
260
309
  #### LLM Request Tools
261
310
 
262
311
  ##### `claude_request`
312
+
263
313
  Execute a Claude Code request with optional session management.
264
314
 
265
315
  **Parameters:**
316
+
266
317
  - `prompt` (string, required): The prompt to send (1-100,000 chars)
267
318
  - `model` (string, optional): Model name or alias (use `list_models` for available values; supports `latest`)
268
319
  - `outputFormat` (string, optional): Output format ("text" or "json"), default: "text"
@@ -281,10 +332,12 @@ Execute a Claude Code request with optional session management.
281
332
  - `correlationId` (string, optional): Request trace ID (auto-generated if omitted)
282
333
 
283
334
  **Response extras:**
335
+
284
336
  - `approval`: Approval decision record when `approvalStrategy="mcp_managed"`
285
337
  - `mcpServers`: Requested/enabled/missing MCP servers for this call
286
338
 
287
339
  **Example:**
340
+
288
341
  ```json
289
342
  {
290
343
  "prompt": "Write a Python function to calculate fibonacci numbers",
@@ -296,9 +349,11 @@ Execute a Claude Code request with optional session management.
296
349
  ```
297
350
 
298
351
  ##### `codex_request`
352
+
299
353
  Execute a Codex request with optional session tracking.
300
354
 
301
355
  **Parameters:**
356
+
302
357
  - `prompt` (string, required): The prompt to send (1-100,000 chars)
303
358
  - `model` (string, optional): Model name or alias (use `list_models` for available values; supports `latest`, recommended: `gpt-5.4`)
304
359
  - `fullAuto` (boolean, optional): Enable full-auto mode, default: false
@@ -314,10 +369,12 @@ Execute a Codex request with optional session tracking.
314
369
  - `idleTimeoutMs` (number, optional): Kill a stuck Codex process after output inactivity; 30,000 to 3,600,000 ms
315
370
 
316
371
  **Response extras:**
372
+
317
373
  - `approval`: Approval decision record when `approvalStrategy="mcp_managed"`
318
374
  - `mcpServers`: Requested MCP servers for this call
319
375
 
320
376
  **Example:**
377
+
321
378
  ```json
322
379
  {
323
380
  "prompt": "Create a REST API endpoint",
@@ -328,9 +385,11 @@ Execute a Codex request with optional session tracking.
328
385
  ```
329
386
 
330
387
  ##### `gemini_request`
388
+
331
389
  Execute a Gemini CLI request with session support.
332
390
 
333
391
  **Parameters:**
392
+
334
393
  - `prompt` (string, required): The prompt to send (1-100,000 chars)
335
394
  - `model` (string, optional): Model name or alias (use `list_models` for available values; supports `latest`, `pro`, `flash`)
336
395
  - `sessionId` (string, optional): Session ID to resume
@@ -347,10 +406,12 @@ Execute a Gemini CLI request with session support.
347
406
  - `correlationId` (string, optional): Request trace ID (auto-generated if omitted)
348
407
 
349
408
  **Response extras:**
409
+
350
410
  - `approval`: Approval decision record when `approvalStrategy="mcp_managed"`
351
411
  - `mcpServers`: Requested MCP servers for this call
352
412
 
353
413
  **Example:**
414
+
354
415
  ```json
355
416
  {
356
417
  "prompt": "Explain quantum computing",
@@ -361,9 +422,11 @@ Execute a Gemini CLI request with session support.
361
422
  ```
362
423
 
363
424
  ##### `grok_request`
425
+
364
426
  Execute a Grok CLI (xAI) request with session support.
365
427
 
366
428
  **Parameters:**
429
+
367
430
  - `prompt` (string, required): The prompt to send (1-100,000 chars)
368
431
  - `model` (string, optional): Model name or alias (e.g. `grok-build`, `latest`)
369
432
  - `outputFormat` (string, optional): `"plain"` (default), `"json"`, or `"streaming-json"`
@@ -384,6 +447,7 @@ Execute a Grok CLI (xAI) request with session support.
384
447
  - `correlationId` (string, optional): Request trace ID (auto-generated if omitted)
385
448
 
386
449
  **Example:**
450
+
387
451
  ```json
388
452
  {
389
453
  "prompt": "Summarize the latest commit message in 1 sentence",
@@ -397,7 +461,7 @@ Execute a Grok CLI (xAI) request with session support.
397
461
  Every async job is persisted to a job store as it transitions through running → completed/failed/canceled. This makes the gateway a durable collection layer:
398
462
 
399
463
  - **Re-issuing a request is safe.** Identical `*_request` / `*_request_async` calls within the dedup window (default 1 hour) short-circuit onto the existing running or completed job — the caller gets back the same job ID instead of starting a duplicate run. This directly fixes the "agent times out polling, re-issues, and the whole job starts over" failure mode.
400
- - **`llm_job_status` and `llm_job_result` work across gateway restarts.** Job rows live for 30 days by default; callers can fetch results long after the in-memory cache has evicted them.
464
+ - **`llm_job_status` and `llm_job_result` work across gateway restarts.** Job rows live for 30 days by default; callers can collect results long after the in-memory cache has evicted them.
401
465
  - **Jobs running at shutdown are marked `orphaned`** on the next gateway boot (the detached child can't be reattached to). Their captured partial output remains readable.
402
466
  - **Pass `forceRefresh: true`** on any request tool to bypass dedup and force a fresh CLI run.
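The dedup behaviour described in this list can be sketched as a keyed lookup with a time window. This is a simplified in-memory model under stated assumptions: `DedupIndex`, `submit`, `JobRecord`, and the string `key` (standing in for whatever request fingerprint the gateway computes) are hypothetical names, and the real job store is persistent rather than a `Map`.

```typescript
type JobState = "running" | "completed" | "failed" | "canceled";

interface JobRecord {
  id: string;
  key: string; // hypothetical fingerprint of the request
  state: JobState;
  createdAtMs: number;
}

class DedupIndex {
  private readonly byKey = new Map<string, JobRecord>();
  private readonly windowMs: number;
  private nextId = 1;

  // Default window of one hour, matching the documented default.
  constructor(windowMs = 60 * 60 * 1000) {
    this.windowMs = windowMs;
  }

  submit(
    key: string,
    nowMs: number,
    forceRefresh = false
  ): { job: JobRecord; deduplicated: boolean } {
    const existing = this.byKey.get(key);
    if (
      existing !== undefined &&
      !forceRefresh &&
      nowMs - existing.createdAtMs <= this.windowMs &&
      (existing.state === "running" || existing.state === "completed")
    ) {
      // Same request inside the window: hand back the existing job.
      return { job: existing, deduplicated: true };
    }
    const job: JobRecord = {
      id: "job-" + String(this.nextId++),
      key,
      state: "running",
      createdAtMs: nowMs,
    };
    this.byKey.set(key, job);
    return { job, deduplicated: false };
  }
}
```

In this model a caller that times out and re-issues the same request inside the window receives the original job ID, while `forceRefresh` or an expired window starts a new run.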
403
467
 
@@ -416,12 +480,14 @@ acknowledgeEphemeral = false # required to enable async tools wit
416
480
  ```
417
481
 
418
482
  Backends:
483
+
419
484
  - **`sqlite`** (default) — durable, file-backed. Safe for single-instance deployments.
420
485
  - **`memory`** — in-process Map. Lost on gateway exit. Requires `acknowledgeEphemeral = true` to be loaded. Suitable for tests and ephemeral CI gateways.
421
486
  - **`postgres`** — interface only, implementation not yet shipped. Selecting this backend throws at startup.
422
487
  - **`none`** — no store. **`*_request_async`, `llm_job_status`, `llm_job_result`, and `llm_job_cancel` are NOT registered on the gateway.** This is a structural invariant: agents that try to call async tools against a gateway with `backend = "none"` get a clean "tool not found" at connect time instead of silent in-memory loss after the 1-hour TTL. Use `llm_process_health` to inspect the resolved persistence state programmatically.
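The backend rules above can be summarised as a small decision function. This is a sketch only: `registeredAsyncTools`, `PersistenceConfig`, and the error messages are hypothetical, and the gateway's real startup code is not shown in this document.

```typescript
type Backend = "sqlite" | "memory" | "postgres" | "none";

interface PersistenceConfig {
  backend: Backend;
  acknowledgeEphemeral?: boolean;
}

// Tool names as listed in the documentation; the first entry is the
// documented wildcard for the per-provider async request tools.
const ASYNC_TOOLS: string[] = [
  "*_request_async",
  "llm_job_status",
  "llm_job_result",
  "llm_job_cancel",
];

// Returns the async tools that would be registered for a given backend,
// or throws where the documentation says startup fails.
function registeredAsyncTools(config: PersistenceConfig): string[] {
  if (config.backend === "postgres") {
    throw new Error("postgres backend is not implemented yet");
  }
  if (config.backend === "memory" && config.acknowledgeEphemeral !== true) {
    throw new Error("memory backend requires acknowledgeEphemeral = true");
  }
  // With no store, the async tools are absent rather than silently lossy.
  return config.backend === "none" ? [] : ASYNC_TOOLS.slice();
}
```

Returning an empty list for `none` mirrors the documented invariant: a client sees "tool not found" immediately instead of losing results later.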
423
488
 
424
489
  Legacy environment variables (deprecated; emit a warning at startup):
490
+
425
491
  - `LLM_GATEWAY_LOGS_DB` / `LLM_GATEWAY_JOBS_DB` — `none` selects `backend = "none"`; any other value selects `backend = "sqlite"` with that path.
426
492
  - `LLM_GATEWAY_JOB_RETENTION_DAYS` — overrides `retentionDays`.
427
493
  - `LLM_GATEWAY_DEDUP_WINDOW_MS` — overrides `dedupWindowMs`.
@@ -459,7 +525,7 @@ backend = "sqlite"
459
525
  path = "/srv/repos/.../my-repo/.gateway/logs.db"
460
526
  ```
461
527
 
462
- Now every gateway subprocess spawned for *this* repo's Claude Code window reads its own config and writes to its own SQLite file; sessions, jobs, and dedup state are scoped to the repo. Other repos keep using the global default. `llm_process_health.persistence.sources.configFile` lets an agent confirm which config it's actually running under.
528
+ Now every gateway subprocess spawned for _this_ repo's Claude Code window reads its own config and writes to its own SQLite file; sessions, jobs, and dedup state are scoped to the repo. Other repos keep using the global default. `llm_process_health.persistence.sources.configFile` lets an agent confirm which config it's actually running under.
463
529
 
464
530
  ###### Agent-executable spec (DAG-TOML)
465
531
 
@@ -472,7 +538,7 @@ template_kind = "implementation-dag"
472
538
  docs = "https://github.com/verivus-oss/agent-assurance/blob/main/SPEC.md"
473
539
  confidentiality = "public"
474
540
  title = "Per-project llm-cli-gateway persistence isolation"
475
- spec = "https://github.com/verivusai-labs/llm-cli-gateway#per-project-isolation"
541
+ spec = "https://github.com/verivus-oss/llm-cli-gateway#per-project-isolation"
476
542
  created = "YYYY-MM-DD"
477
543
  total_units = 5
478
544
  tier1_units = ["U01","U02","U03","U04","U05"]
@@ -623,6 +689,7 @@ consumes = ["OUT:mcp-reconnected"]
623
689
  **Why this matters for agents:** the gateway has multiple configuration surfaces (TOML file, env-var overrides, two different MCP settings files) and one easy mistake — editing the committed `.mcp.json` instead of the local-only `.claude/settings.local.json` — will silently break the per-project scope for every other developer on the repo. The DAG above encodes the correct sequence, the verification gate, and the failure modes explicitly so an agent can execute it without inference.
624
690
 
625
691
  ##### `mistral_request`
692
+
626
693
  Run a Mistral Vibe agentic coding request. Like `grok_request` in shape, but with Vibe's specific surface:
627
694
 
628
695
  - `model` (string, optional): Vibe model alias (for example `mistral-medium-3.5` or `latest`). The resolved value is injected via the `VIBE_ACTIVE_MODEL` environment variable; omit it to let the gateway discover Vibe config and avoid stale hardcoded defaults.
@@ -632,33 +699,41 @@ Run a Mistral Vibe agentic coding request. Like `grok_request` in shape, but wit
632
699
  - `sessionId` / `resumeLatest` / `createNewSession`: standard session controls. Continuity requires `[session_logging] enabled = true` in `~/.vibe/config.toml` — `doctor --json` surfaces an actionable next-action when the toggle is missing.
633
700
 
634
701
  ##### `claude_request_async` / `codex_request_async` / `gemini_request_async` / `grok_request_async` / `mistral_request_async`
702
+
635
703
  Start a long-running Claude, Codex, Gemini, Grok, or Mistral request without waiting for completion in the same MCP call.
636
704
 
637
705
  Use this flow when analysis/runtime can exceed client tool-call limits:
706
+
638
707
  1. Start job with `*_request_async`
639
708
  2. Poll with `llm_job_status`
640
709
  3. Fetch output with `llm_job_result`
641
710
  4. Optionally stop with `llm_job_cancel`
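The four steps above can be sketched as a pure decision function that a client loop would call after each status check. The names `nextStep`, `AsyncStep`, `budgetMs`, and `pollMs` are hypothetical. The status values come from this README, including `orphaned`, which it documents for jobs that were running at shutdown. Collecting output for `failed` and `canceled` jobs is an assumption of this sketch.

```typescript
type JobStatus = "running" | "completed" | "failed" | "canceled" | "orphaned";

type AsyncStep =
  | { tool: "llm_job_status"; waitMs: number }
  | { tool: "llm_job_result" }
  | { tool: "llm_job_cancel" };

// Decides which MCP tool to call next for a job started with *_request_async.
function nextStep(
  jobStatus: JobStatus,
  elapsedMs: number,
  budgetMs: number,
  pollMs = 5000
): AsyncStep {
  if (jobStatus === "running") {
    // Keep polling until the caller's own time budget is used up.
    return elapsedMs >= budgetMs
      ? { tool: "llm_job_cancel" }
      : { tool: "llm_job_status", waitMs: pollMs };
  }
  // Any terminal state: collect whatever stdout/stderr was captured.
  return { tool: "llm_job_result" };
}
```

Keeping the decision separate from the transport makes the loop easy to test without a running gateway; the caller supplies the actual MCP tool invocations.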
642
711
 
643
712
  Async request tools accept the same approval strategy fields as their sync variants:
713
+
644
714
  - `approvalStrategy`: `"legacy"` (default) or `"mcp_managed"`
645
715
  - `approvalPolicy`: `"strict"|"balanced"|"permissive"` override
646
716
  - `mcpServers`: Requested MCP servers (`sqry`, `exa`, `ref_tools`, `trstr`)
647
717
  - `claude_request_async` also supports `strictMcpConfig` and fails fast when requested servers are unavailable
648
718
 
649
719
  ##### `llm_job_status`
720
+
650
721
  Return lifecycle status (`running`, `completed`, `failed`, `canceled`) and metadata for an async job.
651
722
 
652
723
  ##### `llm_job_result`
724
+
653
725
  Return captured stdout/stderr for an async job (with configurable max chars per stream).
654
726
 
655
727
  ##### `llm_job_cancel`
728
+
656
729
  Cancel a running async job.
657
730
 
658
731
  ##### `approval_list`
732
+
659
733
  List recent MCP-managed approval decisions recorded by the gateway.
660
734
 
661
735
  **Parameters:**
736
+
662
737
  - `limit` (number, optional): Max records (1-500), default: 50
663
738
  - `cli` (string, optional): Filter by `"claude"`, `"codex"`, or `"gemini"`
664
739
 
@@ -667,14 +742,17 @@ Approval records are persisted to `~/.llm-cli-gateway/approvals.jsonl`.
667
742
  #### Session Management Tools
668
743
 
669
744
  ##### `session_create`
745
+
670
746
  Create a new session for a specific CLI.
671
747
 
672
748
  **Parameters:**
749
+
673
750
  - `cli` (string, required): CLI to create session for ("claude", "codex", "gemini", "grok", "mistral")
674
751
  - `description` (string, optional): Description for the session
675
752
  - `setAsActive` (boolean, optional): Set as active session, default: true
676
753
 
677
754
  **Example:**
755
+
678
756
  ```json
679
757
  {
680
758
  "cli": "claude",
@@ -684,50 +762,64 @@ Create a new session for a specific CLI.
  ```
 
  ##### `session_list`
+
  List all sessions, optionally filtered by CLI.
 
  **Parameters:**
+
  - `cli` (string, optional): Filter by CLI ("claude", "codex", "gemini", "grok", "mistral")
 
  **Response includes:**
+
  - Total session count
  - Session details (ID, CLI, description, timestamps, active status)
  - Active session IDs for each CLI
 
  ##### `session_set_active`
+
  Set the active session for a specific CLI.
 
  **Parameters:**
+
  - `cli` (string, required): CLI to set active session for
  - `sessionId` (string, required): Session ID to activate (or null to clear)
 
  ##### `session_get`
+
  Retrieve details for a specific session.
 
  **Parameters:**
+
  - `sessionId` (string, required): Session ID to retrieve
 
  ##### `session_delete`
+
  Delete a specific session.
 
  **Parameters:**
+
  - `sessionId` (string, required): Session ID to delete
 
  ##### `session_clear_all`
+
  Clear all sessions, optionally for a specific CLI.
 
  **Parameters:**
+
  - `cli` (string, optional): Clear sessions for specific CLI only
 
  #### Utility Tools
 
  ##### `list_models`
+
  List available models for each CLI.
 
  **Parameters:**
+
  - `cli` (string, optional): Specific CLI to list models for ("claude", "codex", "gemini", "grok", "mistral")
 
  **Response includes:**
+
  - Model names and descriptions
  - Best use cases for each model
  - CLI-specific information
@@ -764,21 +856,26 @@ LLM_GATEWAY_DISABLE_MODEL_DISCOVERY=1
  ```
 
  ##### `cli_versions`
+
  Report installed CLI versions.
 
  **Parameters:**
+
  - `cli` (string, optional): Specific CLI to inspect ("claude", "codex", "gemini", "grok", "mistral")
 
  ##### `cli_upgrade`
+
  Plan or run an upgrade for one CLI.
 
  **Parameters:**
+
  - `cli` (string, required): CLI to upgrade ("claude", "codex", "gemini", "grok", "mistral")
  - `target` (string, optional): Package tag/version/target, default: `latest`
  - `dryRun` (boolean, optional): Return the upgrade plan without running it, default: `true`
  - `timeoutMs` (number, optional): Upgrade timeout when `dryRun=false`
 
  **Upgrade strategies:**
+
  - Claude latest: `claude update`
  - Claude explicit target: `claude install <target>`
  - Codex latest: `codex update`
@@ -786,6 +883,7 @@ Plan or run an upgrade for one CLI.
  - Gemini: `npm install -g @google/gemini-cli@<target>`
 
  **Example dry run:**
+
  ```json
  {
  "cli": "gemini",
@@ -810,7 +908,7 @@ Plan or run an upgrade for one CLI.
  await callTool("session_create", {
  cli: "claude",
  description: "Debugging session",
- setAsActive: true
+ setAsActive: true,
  });
 
  // 2. Make requests (automatically uses active session)
@@ -822,7 +920,7 @@ await callTool("claude_request", {
  // 3. Continue the conversation
  await callTool("claude_request", {
  prompt: "Can you explain that fix in more detail?",
- continueSession: true
+ continueSession: true,
  });
 
  // 4. List all sessions
@@ -831,12 +929,12 @@ await callTool("session_list", { cli: "claude" });
  // 5. Switch to a different session
  await callTool("session_set_active", {
  cli: "claude",
- sessionId: "some-other-session-id"
+ sessionId: "some-other-session-id",
  });
 
  // 6. Delete when done
  await callTool("session_delete", {
- sessionId: "session-id-to-delete"
+ sessionId: "session-id-to-delete",
  });
  ```
 
@@ -864,6 +962,7 @@ await callTool("session_delete", {
  ### CLI-Specific Settings
 
  Each CLI can be configured through its own configuration files:
+
  - Claude Code: `~/.claude/config.json`
  - Codex: `~/.codex/config.toml`
  - Gemini: `~/.gemini/config.json`
@@ -939,6 +1038,7 @@ npm start
  The gateway provides detailed error messages for common issues:
 
  ### CLI Not Found
+
  ```
  Error executing claude CLI:
  spawn claude ENOENT
@@ -947,12 +1047,14 @@ The 'claude' command was not found. Please ensure claude CLI is installed and in
  ```
 
  ### External Timeout / Legacy Timeout Option
+
  ```
  Error executing codex CLI: Command timed out
  Process timed out after 120000ms
  ```
 
  ### Invalid Parameters
+
  ```
  Prompt cannot be empty
  Prompt too long (max 100k chars)
@@ -970,6 +1072,7 @@ Logs are written to stderr (stdout is reserved for MCP protocol):
  ```
 
  Enable debug logging:
+
  ```bash
  DEBUG=1 node dist/index.js
  ```
@@ -979,6 +1082,7 @@ DEBUG=1 node dist/index.js
  ### CLIs Not Found
 
  Make sure the CLIs are installed and in your PATH:
+
  ```bash
  which claude
  which codex
@@ -986,6 +1090,7 @@ which gemini
  ```
 
  The gateway extends PATH to include common locations:
+
  - `~/.local/bin`
  - `/usr/local/bin`
  - `/usr/bin`
@@ -994,6 +1099,7 @@ The gateway extends PATH to include common locations:
  ### Permission Errors
 
  If you encounter permission errors, ensure the CLI tools have proper permissions:
+
  ```bash
  chmod +x $(which claude)
  chmod +x $(which codex)
@@ -1005,16 +1111,19 @@ chmod +x $(which gemini)
  Sessions are stored in `~/.llm-cli-gateway/sessions.json`. If you encounter issues:
 
  1. Check file permissions:
+
  ```bash
  ls -la ~/.llm-cli-gateway/
  ```
 
  2. Reset sessions:
+
  ```bash
  rm ~/.llm-cli-gateway/sessions.json
  ```
 
  3. Or manually edit the session file:
+
  ```bash
  cat ~/.llm-cli-gateway/sessions.json
  ```
@@ -1038,19 +1147,20 @@ The gateway supports concurrent requests across different CLIs. Each request spa
  - **No Eval**: No dynamic code evaluation in our source (see "Socket alerts" below for the transitive `ajv` codegen case)
  - **Sandboxing**: Consider running in containers for production use
  - **Provenance**: Releases are published with [npm provenance](https://docs.npmjs.com/generating-provenance-statements) via OIDC trusted publishing from GitHub Actions
+ - **Release signing**: GitHub release installer artifacts are signed with Sigstore keyless signing; verify `SHA256SUMS.sigstore.json` before trusting the checksum file
 
  ### Socket alerts — context for reviewers
 
  If you're vetting `llm-cli-gateway` through [Socket](https://socket.dev/npm/package/llm-cli-gateway) or a similar supply-chain scanner, you'll see three behavioural alerts and some dependency-ownership alerts. They are accurate descriptions of what the package does and what it depends on; we've left them visible (not silenced in `socket.yml`) so you don't have to take our word for it. Here's the context for each:
 
1046
- | Alert | Where | Why it's bounded |
1047
- |---|---|---|
1048
- | **Network access** | `src/http-transport.ts` opens an HTTP MCP transport when started via `npm run start:http`. `src/endpoint-exposure.ts` issues a HEAD probe to verify configured public/tunnel URLs. | The transport binds to `127.0.0.1` by default and requires `LLM_GATEWAY_AUTH_TOKEN` to be set. The default stdio MCP entry point (`npm start`) opens no sockets. |
1049
- | **Shell access** | `src/executor.ts` uses `child_process.spawn(cmd, args, …)` to invoke the underlying LLM CLIs. | `spawn` is called with an argument array and **never** `shell: true`, so there is no shell interpolation path for caller input. The command name is restricted to an allow-list of known CLI binaries (`claude`, `codex`, `gemini`, `grok`, `vibe`). |
1050
- | **Uses eval** | None in our source. Transitive: `@modelcontextprotocol/sdk` → `ajv@8` uses `new Function(...)` in `ajv/dist/compile/index.js` to compile JSON Schema validators. | This is ajv's standard codegen path. Only known schemas (defined in our source and the MCP SDK) flow into it; no caller-supplied data ever reaches the compiled function body. |
1051
- | **better-sqlite3 PRAGMA helper** | Transitive: `better-sqlite3/lib/methods/pragma.js` interpolates its caller-provided `source` into a `PRAGMA ${source}` statement. | We do not call `db.pragma()` from production source. Internal SQLite setup uses fixed literal `db.exec("PRAGMA ...")` statements, and `npm run security:audit` fails the release if production code reintroduces `.pragma()` calls. |
1052
- | **ioredis obfuscated code** | Optional peer/dev dependency: `ioredis@5.10.1` may be flagged at `built/constants/TLSProfiles.js` for base64-looking strings. | Reviewed as a false positive. The file is a Redis Cloud TLS CA certificate bundle in PEM format, which is base64 by design. It contains no decoder loop, dynamic evaluation, network call, or hidden execution path. The same file is byte-for-byte identical in `ioredis@5.9.2`; our default production install does not install `ioredis`, and our code does not pass ioredis TLS profile options. |
1053
- | **Dependency ownership** | A handful of small transitive packages (e.g. `bindings` via `better-sqlite3`, `media-typer` via `@modelcontextprotocol/sdk`) trip Socket's "unstable ownership" or "obfuscated code" heuristics. | These are pinned, well-known micro-deps in the Node ecosystem with no known issues. We pin direct override versions of `content-type` and `type-is` in `package.json#overrides`. Our previous direct dependency on `toml@3.0.0` (also single-maintainer, last released 2020) was replaced with the actively-maintained `smol-toml` to reduce inherited risk. |
1156
+ | Alert | Where | Why it's bounded |
1157
+ | -------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
1158
+ | **Network access** | `src/http-transport.ts` opens an HTTP MCP transport when started via `npm run start:http`. `src/endpoint-exposure.ts` issues a HEAD probe to verify configured public/tunnel URLs. | The transport binds to `127.0.0.1` by default and requires `LLM_GATEWAY_AUTH_TOKEN` to be set. The default stdio MCP entry point (`npm start`) opens no sockets. |
1159
+ | **Shell access** | `src/executor.ts` uses `child_process.spawn(cmd, args, …)` to invoke the underlying LLM CLIs. | `spawn` is called with an argument array and **never** `shell: true`, so there is no shell interpolation path for caller input. The command name is restricted to an allow-list of known CLI binaries (`claude`, `codex`, `gemini`, `grok`, `vibe`). |
1160
+ | **Uses eval** | None in our source. Transitive: `@modelcontextprotocol/sdk` → `ajv@8` uses `new Function(...)` in `ajv/dist/compile/index.js` to compile JSON Schema validators. | This is ajv's standard codegen path. Only known schemas (defined in our source and the MCP SDK) flow into it; no caller-supplied data ever reaches the compiled function body. |
1161
+ | **better-sqlite3 PRAGMA helper** | Transitive: `better-sqlite3/lib/methods/pragma.js` interpolates its caller-provided `source` into a `PRAGMA ${source}` statement. | We do not call `db.pragma()` from production source. Internal SQLite setup uses fixed literal `db.exec("PRAGMA ...")` statements, and `npm run security:audit` fails the release if production code reintroduces `.pragma()` calls. |
1162
+ | **ioredis obfuscated code** | Optional peer/dev dependency: `ioredis@5.10.1` may be flagged at `built/constants/TLSProfiles.js` for base64-looking strings. | Reviewed as a false positive. The file is a Redis Cloud TLS CA certificate bundle in PEM format, which is base64 by design. It contains no decoder loop, dynamic evaluation, network call, or hidden execution path. The same file is byte-for-byte identical in `ioredis@5.9.2`; our default production install does not install `ioredis`, and our code does not pass ioredis TLS profile options. |
1163
+ | **Dependency ownership** | A handful of small transitive packages (e.g. `bindings` via `better-sqlite3`, `media-typer` via `@modelcontextprotocol/sdk`) trip Socket's "unstable ownership" or "obfuscated code" heuristics. | These are pinned, well-known micro-deps in the Node ecosystem with no known issues. We pin direct override versions of `content-type` and `type-is` in `package.json#overrides`. Our previous direct dependency on `toml@3.0.0` (also single-maintainer, last released 2020) was replaced with the actively-maintained `smol-toml` to reduce inherited risk. |
1054
1164
 
1055
1165
  See [`socket.yml`](./socket.yml) for the same context in machine-readable form.
1056
1166
 
@@ -1070,6 +1180,7 @@ MIT. See [LICENSE](LICENSE) for details.
  ## Support
 
  For issues and questions:
+
  - Open an issue on GitHub
  - Check existing issues and documentation
  - Review CLI-specific documentation for CLI-related problems
@@ -51,7 +51,7 @@ function truncateText(value, maxChars) {
  return { text: value, truncated: false };
  }
  return {
- text: value.slice(value.length - maxChars),
+ text: value.slice(0, maxChars),
  truncated: true,
  };
  }
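Review note: this one-line change flips the truncation policy from keeping the tail to keeping the head, which is the `llm_job_result` fix listed in the changelog. A minimal sketch of the two policies side by side follows. The length guard is an assumption, since the hunk shows only the branch bodies, and the names `truncateHead` and `truncateTail` are hypothetical.

```javascript
// Sketch of the two truncation policies; the `value.length <= maxChars`
// guard is assumed, because the hunk only shows the branch bodies.
function truncateHead(value, maxChars) {
  if (value.length <= maxChars) {
    return { text: value, truncated: false };
  }
  // 1.15.2 behaviour: keep the first maxChars characters
  return { text: value.slice(0, maxChars), truncated: true };
}

function truncateTail(value, maxChars) {
  if (value.length <= maxChars) {
    return { text: value, truncated: false };
  }
  // Previous behaviour: keep the last maxChars characters
  return { text: value.slice(value.length - maxChars), truncated: true };
}

const log = "line1\nline2\nline3";
console.log(truncateHead(log, 5).text); // "line1"
console.log(truncateTail(log, 5).text); // "line3"
```

Callers that relied on seeing the end of a long stream (for example a final error message) now see the beginning instead, so the `truncated` flag is the signal to request a larger limit.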
@@ -816,8 +816,9 @@ export class AsyncJobManager {
  job.error = "Output exceeded maximum size (50MB)";
  job.finishedAt = new Date().toISOString();
  job.clearIdleTimer?.();
- if (job.process)
+ if (job.process) {
  killProcessGroup(job.process, "SIGTERM");
+ }
  this.logger.info(`Job ${job.id} killed due to output overflow`, {
  correlationId: job.correlationId,
  });
@@ -825,11 +826,16 @@ export class AsyncJobManager {
  this.persistComplete(job);
  this.writeFlightComplete(job, "failed", "Output exceeded maximum size (50MB)");
  this.fireOnComplete(job);
- setTimeout(() => {
- if (!job.exited && job.process)
- killProcessGroup(job.process, "SIGKILL");
+ if (job.process) {
+ setTimeout(() => {
+ if (!job.exited && job.process)
+ killProcessGroup(job.process, "SIGKILL");
+ job.cleanupGroup?.();
+ }, 5000);
+ }
+ else {
  job.cleanupGroup?.();
- }, 5000);
+ }
  }
  return;
  }
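Review note: taken together, the two hunks above implement a SIGTERM, grace period, SIGKILL escalation, and now skip the five-second timer entirely when there is no process to signal. The sketch below isolates that control flow so it can be exercised without real processes. `terminateWithEscalation` and the injected `kill`, `cleanup`, and `schedule` parameters are hypothetical stand-ins for `killProcessGroup`, `job.cleanupGroup`, and `setTimeout`; this is not the `AsyncJobManager` code itself.

```javascript
// Sketch only: `kill`, `cleanup`, and `schedule` are injected stand-ins for
// killProcessGroup, job.cleanupGroup, and setTimeout.
function terminateWithEscalation(job, { kill, cleanup, schedule = setTimeout, graceMs = 5000 }) {
  if (!job.process) {
    // Nothing to signal, so release group resources immediately
    cleanup();
    return;
  }
  kill(job.process, "SIGTERM");
  schedule(() => {
    // Escalate only if the process ignored SIGTERM
    if (!job.exited && job.process) {
      kill(job.process, "SIGKILL");
    }
    cleanup();
  }, graceMs);
}
```

Injecting the scheduler keeps the grace period testable: a test can capture the callback and invoke it directly instead of waiting five seconds.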
package/dist/executor.js CHANGED
@@ -139,18 +139,48 @@ export function resolveCommandForSpawn(command, args, options = {}) {
  if ([".cmd", ".bat"].includes(extname(resolved).toLowerCase())) {
  return {
  command: "cmd.exe",
- args: ["/d", "/s", "/c", `"${buildWindowsCmdCommand(resolved, args)}"`],
+ args: [
+ "/d",
+ "/s",
+ "/c",
+ // Windows .cmd/.bat shims require cmd.exe. `buildWindowsCmdCommand`
+ // applies CommandLineToArgvW quoting and cmd metacharacter escaping
+ // to every dynamic segment before it reaches this shell boundary.
+ //
+ // codeql[js/shell-command-constructed-from-input]
+ `"${buildWindowsCmdCommand(resolved, args)}"`,
+ ],
  windowsVerbatimArguments: true,
  };
  }
  return { command: resolved, args };
  }
  function buildWindowsCmdCommand(command, args) {
+ // codeql[js/shell-command-constructed-from-input]
  return [escapeWindowsCmdCommand(command), ...args.map(escapeWindowsCmdArgument)].join(" ");
  }
- const WINDOWS_CMD_META_CHARS = /([()\][%!^"`<>&|;, *?])/g;
+ const WINDOWS_CMD_META_CHARS = new Set([
+ "(",
+ ")",
+ "]",
+ "[",
+ "%",
+ "!",
+ "^",
+ '"',
+ "`",
+ "<",
+ ">",
+ "&",
+ "|",
+ ";",
+ ",",
+ " ",
+ "*",
+ "?",
+ ]);
  function escapeWindowsCmdCommand(value) {
- return win32.normalize(value).replace(WINDOWS_CMD_META_CHARS, "^$1");
+ return escapeWindowsCmdMetaChars(win32.normalize(value));
  }
  // CommandLineToArgvW rules: a run of N backslashes before a literal " must be
  // doubled and followed by \" (yielding 2N+1 backslashes total, so the parser
@@ -158,11 +188,38 @@ function escapeWindowsCmdCommand(value) {
  // before the closing " must be doubled (2N) so the quote still terminates the
  // arg. Then wrap in quotes and caret-escape cmd.exe metacharacters.
  function escapeWindowsCmdArgument(value) {
- let arg = `${value}`;
- arg = arg.replace(/(\\*)"/g, '$1$1\\"');
- arg = arg.replace(/(\\*)$/, "$1$1");
- arg = `"${arg}"`;
- return arg.replace(WINDOWS_CMD_META_CHARS, "^$1");
+ return escapeWindowsCmdMetaChars(quoteWindowsArgForCommandLineToArgv(`${value}`));
+ }
+ function quoteWindowsArgForCommandLineToArgv(value) {
+ let encoded = "";
+ let backslashes = 0;
+ for (const ch of value) {
+ if (ch === "\\") {
+ backslashes += 1;
+ continue;
+ }
+ if (ch === '"') {
+ encoded += "\\".repeat(backslashes * 2 + 1);
+ encoded += '"';
+ backslashes = 0;
+ continue;
+ }
+ encoded += "\\".repeat(backslashes);
+ backslashes = 0;
+ encoded += ch;
+ }
+ encoded += "\\".repeat(backslashes * 2);
+ return `"${encoded}"`;
+ }
+ function escapeWindowsCmdMetaChars(value) {
+ let escaped = "";
+ for (const ch of value) {
+ if (WINDOWS_CMD_META_CHARS.has(ch)) {
+ escaped += "^";
+ }
+ escaped += ch;
+ }
+ return escaped;
  }
  function resolveWindowsCommandPath(command, envPath) {
  if (/[\\/]/.test(command)) {
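Review note: the regex-based escaping was replaced with explicit character loops that apply the same two layers, CommandLineToArgvW quoting first and cmd.exe caret escaping second. The sketch below condenses the two new helpers from the hunks above so the layering can be checked on sample inputs. The function bodies follow the diff; the compact `Set` construction and the sample strings are illustrative additions.

```javascript
// Same 18 characters as the Set in the diff, built from a string for brevity.
const WINDOWS_CMD_META_CHARS = new Set([...'()][%!^"`<>&|;, *?']);

function quoteWindowsArgForCommandLineToArgv(value) {
  let encoded = "";
  let backslashes = 0;
  for (const ch of value) {
    if (ch === "\\") {
      backslashes += 1; // defer: the count only matters if a quote follows
      continue;
    }
    if (ch === '"') {
      // 2N+1 backslashes, then the quote, so the parser sees a literal "
      encoded += "\\".repeat(backslashes * 2 + 1) + '"';
      backslashes = 0;
      continue;
    }
    encoded += "\\".repeat(backslashes) + ch;
    backslashes = 0;
  }
  // Trailing backslashes are doubled so the closing quote still terminates the arg
  return `"${encoded + "\\".repeat(backslashes * 2)}"`;
}

function escapeWindowsCmdMetaChars(value) {
  let escaped = "";
  for (const ch of value) {
    escaped += WINDOWS_CMD_META_CHARS.has(ch) ? `^${ch}` : ch;
  }
  return escaped;
}

const quoted = quoteWindowsArgForCommandLineToArgv('say "hi"');
console.log(quoted); // "say \"hi\""
console.log(escapeWindowsCmdMetaChars(quoted)); // ^"say^ \^"hi\^"^"
console.log(quoteWindowsArgForCommandLineToArgv("C:\\tmp\\")); // "C:\tmp\\"
```

The order matters: quoting must run before caret escaping, because the carets are consumed by cmd.exe and the backslash-quote sequences are consumed later by the child process's argument parser.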
package/dist/index.js CHANGED
@@ -486,7 +486,7 @@ cwd) {
  jobId: job.id,
  cli,
  correlationId: corrId,
- message: `Execution exceeded sync deadline (${SYNC_DEADLINE_MS}ms). Poll with llm_job_status, fetch with llm_job_result.`,
+ message: `Execution exceeded sync deadline (${SYNC_DEADLINE_MS}ms). Poll with llm_job_status, collect with llm_job_result.`,
  };
  }
  function isDeferredResponse(result) {
@@ -505,7 +505,7 @@ function buildDeferredToolResponse(deferred, sessionId) {
  message: deferred.message,
  sessionId: sessionId || null,
  pollWith: "llm_job_status",
- fetchWith: "llm_job_result",
+ collectWith: "llm_job_result",
  cancelWith: "llm_job_cancel",
  }, null, 2),
  },
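Review note: the rename from `fetchWith` to `collectWith` changes a field in the deferred-job response, so clients that read the old field may need a fallback while both gateway versions are in use. A minimal client-side sketch follows. `resultToolFor` is a hypothetical helper, not part of the package, and whether a fallback is needed at all depends on how a given client consumes the response.

```javascript
// Sketch only: `resultToolFor` is a hypothetical client-side helper.
function resultToolFor(deferred) {
  // Prefer the 1.15.2 field; fall back to the field name used before the rename
  return deferred.collectWith ?? deferred.fetchWith ?? null;
}

console.log(resultToolFor({ collectWith: "llm_job_result" })); // "llm_job_result"
console.log(resultToolFor({ fetchWith: "llm_job_result" })); // "llm_job_result"
```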
package/dist/job-store.js CHANGED
@@ -245,7 +245,7 @@ export class SqliteJobStore {
  */
  markOrphanedOnStartup() {
  const now = new Date().toISOString();
- // Orphaned jobs retain a short window so callers can fetch the partial output,
+ // Orphaned jobs retain a short window so callers can collect the partial output,
  // then evict. Reuse the standard retention.
  const expiresAt = new Date(Date.now() + this.retentionMs).toISOString();
  // SELECT before UPDATE — gateway boot is single-threaded so no row can
@@ -626,10 +626,22 @@ export function prependGeminiAttachments(prompt, attachments) {
  if (!existsSync(p)) {
  throw new Error(`attachments: path does not exist: ${p}`);
  }
+ validateGeminiAttachmentTokenPath(p);
  }
  const tokens = attachments.map(p => `@${p}`).join(" ");
+ // Gemini attachments are prompt-level @path tokens rather than shell
+ // commands. Paths are absolute, existing, and token-safe before this join.
+ //
+ // codeql[js/shell-command-constructed-from-input]
  return `${tokens} ${prompt}`;
  }
+ function validateGeminiAttachmentTokenPath(path) {
+ for (const ch of path) {
+ if (ch === "@" || ch <= " ") {
+ throw new Error(`attachments: path cannot be represented as a Gemini @path token without escaping: ${path}`);
+ }
+ }
+ }
  /**
  * Zod schema for the U27 Gemini high-impact feature subset. Used by the
  * `gemini_request` / `gemini_request_async` tool schemas to validate the new
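Review note: the new `validateGeminiAttachmentTokenPath` check rejects any path containing `@` or a character at or below the space character, which covers whitespace and ASCII control characters. The sketch below restates the same character test as a boolean predicate to show which paths pass. `isGeminiTokenSafe` is a hypothetical name and the sample paths are made up for illustration.

```javascript
// Same character test as the diff, written as a predicate for illustration.
function isGeminiTokenSafe(path) {
  for (const ch of path) {
    // "@" would start a new token; anything <= " " is whitespace or a control character
    if (ch === "@" || ch <= " ") {
      return false;
    }
  }
  return true;
}

console.log(isGeminiTokenSafe("/home/user/report.pdf")); // true
console.log(isGeminiTokenSafe("/home/user/my notes.txt")); // false
console.log(isGeminiTokenSafe("/home/user/a@b.txt")); // false
```

Paths with spaces are therefore rejected outright rather than escaped, so callers would need to rename or copy such files before attaching them.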
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "llm-cli-gateway",
- "version": "1.15.0",
+ "version": "1.15.2",
  "mcpName": "io.github.verivus-oss/llm-cli-gateway",
  "description": "MCP server providing unified access to Claude Code, Codex, Gemini, Grok, and Mistral Vibe CLIs with session management, retry logic, async job orchestration, durable job results, and cross-LLM validation.",
  "license": "MIT",
package/socket.yml CHANGED
@@ -14,24 +14,12 @@ version: 2
  # src/endpoint-exposure.ts also issues a HEAD probe when verifying
  # tunnel reachability — opt-in via the start:http entry point only.
  #
- # Additionally, Socket may flag `dist/index.js` and `dist/job-store.js`
- # against the `globalThis["fetch"]` rule. This is a substring-match
- # false positive (verified for v1.6.0 by sub-agent investigation on
- # 2026-05-26; same matches exist in v1.5.35). Neither file contains
- # any `fetch(`, `globalThis.fetch`, polyfill import, or any other
- # network-call construct. The matches are:
- # - dist/index.js — the English word "fetch" inside an async-defer
- # error message ("Poll with llm_job_status, fetch with
- # llm_job_result.") AND the JSON field name `fetchWith:
- # "llm_job_result"` (part of the deferred-job response contract).
- # - dist/job-store.js — the word "fetch" inside a code comment on
- # markOrphanedOnStartup() describing how callers retrieve partial
- # output from SQLite.
- # Verify with: `grep -rEn "\bfetch\(|globalThis\.fetch|globalThis\[" dist/`
- # — returns empty. Production code does not import undici / node-fetch
- # / axios / got. The cache-awareness slice (v1.6.0) introduced zero
- # new network surfaces; all I/O is filesystem (SQLite, sessions.json)
- # or in-process.
+ # Historical note: Socket previously flagged `dist/index.js` and
+ # `dist/job-store.js` because async-job prose used retrieval wording that
+ # resembled a browser-network primitive. The package now uses "collect" /
+ # `collectWith` wording for deferred job results. Production code does not
+ # import bundled HTTP client libraries; all default I/O is filesystem
+ # (SQLite, sessions.json) or explicit local CLI process I/O.
  #
  # shellAccess
  # src/executor.ts uses child_process.spawn(cmd, args, { ... }) with a