llm-cli-gateway 1.14.0 → 1.15.1
- package/CHANGELOG.md +249 -46
- package/README.md +139 -29
- package/dist/async-job-manager.js +20 -8
- package/dist/executor.js +65 -8
- package/dist/index.d.ts +101 -0
- package/dist/index.js +311 -26
- package/dist/request-helpers.js +12 -0
- package/dist/session-manager.d.ts +20 -2
- package/dist/session-manager.js +28 -3
- package/dist/worktree-manager.d.ts +41 -0
- package/dist/worktree-manager.js +214 -0
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -1,13 +1,31 @@
|
|
|
1
1
|
# llm-cli-gateway
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
[](https://github.com/verivus-oss/llm-cli-gateway/actions/workflows/ci.yml)
|
|
4
|
+
[](https://github.com/verivus-oss/llm-cli-gateway/actions/workflows/security.yml)
|
|
5
|
+
[](https://scorecard.dev/viewer/?uri=github.com/verivus-oss/llm-cli-gateway)
|
|
6
|
+
[](https://www.npmjs.com/package/llm-cli-gateway)
|
|
7
|
+
[](LICENSE)
|
|
8
|
+
[](SECURITY.md#release-signing)
|
|
9
|
+
|
|
10
|
+
> _"Without consultation, plans are frustrated, but with many counselors they succeed."_
|
|
4
11
|
> — Proverbs 15:22 (LSB)
|
|
5
12
|
|
|
6
|
-
A Model Context Protocol (MCP)
|
|
13
|
+
A Model Context Protocol (MCP) gateway for running Claude Code, Codex, Gemini, Grok, and Mistral (Vibe) CLIs from one MCP endpoint, with durable async jobs, session continuity, cache-aware prompting, observability, and personal-appliance setup tooling.
|
|
14
|
+
|
|
15
|
+
## What It Provides Today
|
|
16
|
+
|
|
17
|
+
`llm-cli-gateway` is a single-user MCP gateway for cross-LLM validation and multi-agent coding workflows. It is more than a thin CLI wrapper:
|
|
18
|
+
|
|
19
|
+
- Runs five provider CLIs through consistent sync and async MCP tools.
|
|
20
|
+
- Persists long-running jobs, supports restart-safe result collection, deduplication, cancellation, and sync-to-async deferral.
|
|
21
|
+
- Tracks sessions, real CLI resume paths, structured response metadata, and cache telemetry.
|
|
22
|
+
- Supports cache-aware `promptParts`, including explicit Claude `cache_control` when opted in.
|
|
23
|
+
- Can run requests inside gateway-managed git worktrees for isolated multi-agent review and implementation loops.
|
|
24
|
+
- Ships personal-appliance setup surfaces: HTTP transport with bearer-token auth, `doctor --json`, setup UI artifacts, provider setup snippets, Docker fallback, and checked release bundles.
|
|
7
25
|
|
|
8
26
|
## Personal MCP Appliance MVP
|
|
9
27
|
|
|
10
|
-
|
|
28
|
+
The personal-appliance contract keeps that surface intentionally narrow: one trusted user runs the gateway on a machine or volume they own, connects one MCP endpoint, and asks any connected client for cross-LLM validation.
|
|
11
29
|
|
|
12
30
|
The product contract is documented in [docs/personal-mcp/PRODUCT_CONTRACT.md](docs/personal-mcp/PRODUCT_CONTRACT.md). It defines the single-user scope, security posture, target support matrix, and provider-support verification gates. Public setup guides must not claim ChatGPT, Claude web, Claude Desktop, Codex, Gemini CLI, Gemini web, or Grok inbound support until the corresponding provider/client path has been verified.
|
|
13
31
|
|
|
@@ -19,7 +37,7 @@ Current personal-appliance artifacts include:
|
|
|
19
37
|
|
|
20
38
|
- Streamable HTTP startup: `LLM_GATEWAY_AUTH_TOKEN=<token> npm run start:http`
|
|
21
39
|
- Machine-readable diagnostics: `npm run doctor`
|
|
22
|
-
- Go bootstrapper
|
|
40
|
+
- Go bootstrapper: `installer/` with `setup`, `doctor --json`, `start`, `stop`, `status`, `repair`, `upgrade`, `uninstall`, `print-client-config`, and verified bundle download commands.
|
|
23
41
|
- Release packaging: the release workflow builds Linux binaries on the local self-hosted runner, builds Windows/macOS binaries on GitHub-hosted runners, then publishes checksummed platform bundles with the gateway, production dependencies, and a managed Node runtime; see [installer/packaging/README.md](installer/packaging/README.md).
|
|
24
42
|
- Docker Compose fallback: [docker-compose.personal.yml](docker-compose.personal.yml) + [Dockerfile.personal](Dockerfile.personal) for users who already manage containers.
|
|
25
43
|
- Local setup UI artifact: [setup/ui/index.html](setup/ui/index.html)
|
|
@@ -34,11 +52,25 @@ Windows PowerShell:
|
|
|
34
52
|
$Version = '<version>'
|
|
35
53
|
$Base = "https://github.com/verivus-oss/llm-cli-gateway/releases/download/v$Version"
|
|
36
54
|
$InstallDir = Join-Path (Join-Path $env:LOCALAPPDATA 'Programs') 'llm-cli-gateway'
|
|
55
|
+
$ExeName = "llm-cli-gateway-$Version-windows-amd64.exe"
|
|
56
|
+
$BundleName = "llm-cli-gateway-bundle-$Version-windows-amd64.tar.gz"
|
|
37
57
|
$Exe = Join-Path $InstallDir 'llm-cli-gateway.exe'
|
|
58
|
+
$Checksums = Join-Path $InstallDir 'SHA256SUMS'
|
|
59
|
+
$ChecksumBundle = Join-Path $InstallDir 'SHA256SUMS.sigstore.json'
|
|
38
60
|
New-Item -ItemType Directory -Force $InstallDir | Out-Null
|
|
39
|
-
Invoke-WebRequest -UseBasicParsing "$Base
|
|
40
|
-
|
|
41
|
-
$
|
|
61
|
+
Invoke-WebRequest -UseBasicParsing "$Base/$ExeName" -OutFile $Exe
|
|
62
|
+
Invoke-WebRequest -UseBasicParsing "$Base/SHA256SUMS" -OutFile $Checksums
|
|
63
|
+
Invoke-WebRequest -UseBasicParsing "$Base/SHA256SUMS.sigstore.json" -OutFile $ChecksumBundle
|
|
64
|
+
cosign verify-blob $Checksums --bundle $ChecksumBundle --certificate-identity "https://github.com/verivus-oss/llm-cli-gateway/.github/workflows/release-installer.yml@refs/tags/v$Version" --certificate-oidc-issuer "https://token.actions.githubusercontent.com"
|
|
65
|
+
if ($LASTEXITCODE -ne 0) { throw "Sigstore verification failed for SHA256SUMS" }
|
|
66
|
+
function Get-ReleaseSha256($Name) {
|
|
67
|
+
$line = Select-String -Path $Checksums -Pattern "^[a-fA-F0-9]{64}\s+$([regex]::Escape($Name))$" | Select-Object -First 1
|
|
68
|
+
if (-not $line) { throw "No SHA256SUMS entry found for $Name" }
|
|
69
|
+
return (($line.Line -split "\s+")[0]).ToLowerInvariant()
|
|
70
|
+
}
|
|
71
|
+
if ((Get-FileHash $Exe -Algorithm SHA256).Hash.ToLowerInvariant() -ne (Get-ReleaseSha256 $ExeName)) { throw "Checksum mismatch for $ExeName" }
|
|
72
|
+
$env:RVWR_GATEWAY_BUNDLE_URL = "$Base/$BundleName"
|
|
73
|
+
$env:RVWR_GATEWAY_BUNDLE_SHA256 = Get-ReleaseSha256 $BundleName
|
|
42
74
|
& $Exe setup
|
|
43
75
|
& $Exe stop
|
|
44
76
|
& $Exe install-bundle
|
|
@@ -53,6 +85,9 @@ PATH. Do not script against release-versioned exe names after install.
|
|
|
53
85
|
|
|
54
86
|
```bash
|
|
55
87
|
# After downloading the binary that matches your OS/arch from a release:
|
|
88
|
+
cosign verify-blob SHA256SUMS --bundle SHA256SUMS.sigstore.json \
|
|
89
|
+
--certificate-identity "https://github.com/verivus-oss/llm-cli-gateway/.github/workflows/release-installer.yml@refs/tags/v<version>" \
|
|
90
|
+
--certificate-oidc-issuer "https://token.actions.githubusercontent.com"
|
|
56
91
|
sha256sum --check SHA256SUMS # verify before run (or `shasum -a 256 --check` on macOS)
|
|
57
92
|
chmod +x llm-cli-gateway-<ver>-<os>-<arch>
|
|
58
93
|
./llm-cli-gateway-<ver>-<os>-<arch> setup
|
|
@@ -79,13 +114,16 @@ docker compose -f docker-compose.personal.yml run --rm doctor
|
|
|
79
114
|
## Features
|
|
80
115
|
|
|
81
116
|
### Core Capabilities
|
|
117
|
+
|
|
82
118
|
- **Multi-LLM Orchestration**: Unified interface for Claude Code, Codex, Gemini, Grok, and Mistral (Vibe) CLIs
|
|
83
119
|
- **Session Management**: Track and resume conversations across all CLIs with persistent storage
|
|
120
|
+
- **Gateway-owned worktrees**: Run any sync or async provider request inside a managed git worktree, with per-session reuse and cleanup
|
|
84
121
|
- **Token Optimization**: Automatic 44% reduction on prompts, 37% on responses (opt-in)
|
|
85
122
|
- **Correlation ID Tracking**: Full request tracing across all LLM interactions
|
|
86
123
|
- **Cross-Tool Collaboration**: LLMs can use each other via MCP (validated through dogfooding)
|
|
87
124
|
|
|
88
125
|
### Observability
|
|
126
|
+
|
|
89
127
|
- **SQLite Flight Recorder**: Every request/response logged to `~/.llm-cli-gateway/logs.db` with correlation IDs, token usage, duration, retry counts, and circuit breaker state. Browse with [Datasette](https://datasette.io/): `datasette ~/.llm-cli-gateway/logs.db`
|
|
90
128
|
- **Structured Metadata**: Tool responses include machine-readable `structuredContent` (model, cli, correlationId, sessionId, durationMs, token counts)
|
|
91
129
|
- **Cache observability resources**: `cache_state://global`, `cache_state://session/{id}`, and `cache_state://prefix/{hash}` MCP resources return aggregate cache hit/miss/savings — tokens and hashes only, no prompt text. `session_get` includes a `cacheState` block when the session has prior requests.
|
|
@@ -109,17 +147,18 @@ Every `*_request` and `*_request_async` tool accepts an optional `promptParts` f
|
|
|
109
147
|
|
|
110
148
|
Per-CLI capability matrix:
|
|
111
149
|
|
|
112
|
-
| CLI | Prefix discipline (auto via `promptParts`) | Explicit `cache_control` emission
|
|
113
|
-
|
|
114
|
-
| claude | yes |
|
|
115
|
-
| codex | yes | n/a (OpenAI implicit cache, no CLI lever)
|
|
116
|
-
| gemini | yes | n/a (implicit prefix cache server-side)
|
|
117
|
-
| grok | yes | n/a (no surfaced cache lever)
|
|
118
|
-
| mistral | yes | n/a (no surfaced cache lever)
|
|
150
|
+
| CLI | Prefix discipline (auto via `promptParts`) | Explicit `cache_control` emission |
|
|
151
|
+
| ------- | ------------------------------------------ | ---------------------------------------------------------------------------- |
|
|
152
|
+
| claude | yes | yes, opt-in via `promptParts.cacheControl` and `outputFormat: "stream-json"` |
|
|
153
|
+
| codex | yes | n/a (OpenAI implicit cache, no CLI lever) |
|
|
154
|
+
| gemini | yes | n/a (implicit prefix cache server-side) |
|
|
155
|
+
| grok | yes | n/a (no surfaced cache lever) |
|
|
156
|
+
| mistral | yes | n/a (no surfaced cache lever) |
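The matrix above can be read as a small lookup. The sketch below is illustrative only: the type and function names are assumptions, and the boolean `cacheControlRequested` stands in for whatever shape `promptParts.cacheControl` actually takes.

```typescript
type Cli = "claude" | "codex" | "gemini" | "grok" | "mistral";

// Reasons copied from the matrix for CLIs without an explicit cache lever.
const NO_EXPLICIT_LEVER: Record<Exclude<Cli, "claude">, string> = {
  codex: "OpenAI implicit cache, no CLI lever",
  gemini: "implicit prefix cache server-side",
  grok: "no surfaced cache lever",
  mistral: "no surfaced cache lever",
};

// Hypothetical helper: would this request carry explicit cache_control?
// Per the matrix, only claude qualifies, and only when the caller opts in
// and asks for the "stream-json" output format.
function emitsExplicitCacheControl(
  cli: Cli,
  cacheControlRequested: boolean,
  outputFormat: string
): boolean {
  if (cli !== "claude") {
    return false;
  }
  return cacheControlRequested && outputFormat === "stream-json";
}

// Hypothetical helper: a short human-readable note for each CLI.
function explicitCacheNote(cli: Cli): string {
  return cli === "claude"
    ? "opt-in via promptParts.cacheControl and outputFormat stream-json"
    : `n/a (${NO_EXPLICIT_LEVER[cli]})`;
}
```

Prefix discipline via `promptParts` applies to every CLI in the matrix, so the sketch only models the second column.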
|
|
119
157
|
|
|
120
158
|
Opt-in flags (all default off) live under `[cache_awareness]` in `~/.llm-cli-gateway/config.toml`. See `docs/personal-mcp/PROVIDER_CACHE_SURFACES.md` for the per-model minimum cacheable token thresholds and field-name divergences.
|
|
121
159
|
|
|
122
160
|
### Reliability & Performance
|
|
161
|
+
|
|
123
162
|
- **Retry Logic**: Exponential backoff with circuit breaker for transient failures
|
|
124
163
|
- **Atomic File Writes**: Process-specific temp files with fsync for data integrity
|
|
125
164
|
- **Memory Limits**: 50MB cap on CLI output prevents DoS attacks
|
|
@@ -127,7 +166,8 @@ Opt-in flags (all default off) live under `[cache_awareness]` in `~/.llm-cli-gat
|
|
|
127
166
|
- **Long-Running Jobs**: Non-time-bound async execution via `*_request_async` + polling tools
|
|
128
167
|
|
|
129
168
|
### Security & Quality
|
|
130
|
-
|
|
169
|
+
|
|
170
|
+
- **Comprehensive Testing**: 900+ tests covering unit, integration, and regression scenarios with real CLI execution
|
|
131
171
|
- **Input Validation**: Zod schemas prevent injection attacks
|
|
132
172
|
- **No Secret Leakage**: Generic session descriptions only (file permissions 0o600)
|
|
133
173
|
- **No ReDoS**: Bounded regex patterns prevent catastrophic backtracking
|
|
@@ -139,6 +179,7 @@ Opt-in flags (all default off) live under `[cache_awareness]` in `~/.llm-cli-gat
|
|
|
139
179
|
Before using this gateway, you need to install the CLI tools you want to use:
|
|
140
180
|
|
|
141
181
|
### Claude Code CLI
|
|
182
|
+
|
|
142
183
|
```bash
|
|
143
184
|
# Installation instructions for Claude Code
|
|
144
185
|
# Visit: https://docs.anthropic.com/claude-code
|
|
@@ -146,18 +187,21 @@ npm install -g @anthropic-ai/claude-code
|
|
|
146
187
|
```
|
|
147
188
|
|
|
148
189
|
### Codex CLI
|
|
190
|
+
|
|
149
191
|
```bash
|
|
150
192
|
npm install -g @openai/codex
|
|
151
193
|
codex login
|
|
152
194
|
```
|
|
153
195
|
|
|
154
196
|
### Gemini CLI
|
|
197
|
+
|
|
155
198
|
```bash
|
|
156
199
|
npm install -g @google/gemini-cli
|
|
157
200
|
# Or: https://github.com/google-gemini/gemini-cli
|
|
158
201
|
```
|
|
159
202
|
|
|
160
203
|
### Grok CLI (xAI)
|
|
204
|
+
|
|
161
205
|
```bash
|
|
162
206
|
npm install -g grok-build
|
|
163
207
|
grok login # OAuth flow, or set GROK_CODE_XAI_API_KEY
|
|
@@ -165,6 +209,7 @@ grok login # OAuth flow, or set GROK_CODE_XAI_API_KEY
|
|
|
165
209
|
```
|
|
166
210
|
|
|
167
211
|
### Mistral Vibe CLI
|
|
212
|
+
|
|
168
213
|
```bash
|
|
169
214
|
# Pick one — the gateway's cli_upgrade auto-detects which one you used.
|
|
170
215
|
pip install vibe-cli
|
|
@@ -184,7 +229,7 @@ Vibe-specific notes:
|
|
|
184
229
|
requested or Vibe config needs recovery, and retries once after a
|
|
185
230
|
model-not-found failure with refreshed discovery.
|
|
186
231
|
- **`permissionMode` accepts** `default | plan | accept-edits | auto-approve |
|
|
187
|
-
|
|
232
|
+
chat | explore | lean` and emits `--agent <mode>`. The gateway's
|
|
188
233
|
programmatic-mode default is `auto-approve`; pick a stricter mode
|
|
189
234
|
explicitly if you need approval gates.
|
|
190
235
|
- **`allowedTools` is allow-list only** — the gateway emits one
|
|
@@ -198,11 +243,13 @@ Vibe-specific notes:
|
|
|
198
243
|
## Installation
|
|
199
244
|
|
|
200
245
|
### As an MCP server (npm)
|
|
246
|
+
|
|
201
247
|
```bash
|
|
202
248
|
npm install -g llm-cli-gateway
|
|
203
249
|
```
|
|
204
250
|
|
|
205
251
|
Or use directly with `npx`:
|
|
252
|
+
|
|
206
253
|
```json
|
|
207
254
|
{
|
|
208
255
|
"mcpServers": {
|
|
@@ -215,6 +262,7 @@ Or use directly with `npx`:
|
|
|
215
262
|
```
|
|
216
263
|
|
|
217
264
|
### From source
|
|
265
|
+
|
|
218
266
|
```bash
|
|
219
267
|
git clone https://github.com/verivus-oss/llm-cli-gateway.git
|
|
220
268
|
cd llm-cli-gateway
|
|
@@ -260,9 +308,11 @@ The validation report preserves per-provider disagreement. Optional judge synthe
|
|
|
260
308
|
#### LLM Request Tools
|
|
261
309
|
|
|
262
310
|
##### `claude_request`
|
|
311
|
+
|
|
263
312
|
Execute a Claude Code request with optional session management.
|
|
264
313
|
|
|
265
314
|
**Parameters:**
|
|
315
|
+
|
|
266
316
|
- `prompt` (string, required): The prompt to send (1-100,000 chars)
|
|
267
317
|
- `model` (string, optional): Model name or alias (use `list_models` for available values; supports `latest`)
|
|
268
318
|
- `outputFormat` (string, optional): Output format ("text" or "json"), default: "text"
|
|
@@ -281,10 +331,12 @@ Execute a Claude Code request with optional session management.
|
|
|
281
331
|
- `correlationId` (string, optional): Request trace ID (auto-generated if omitted)
|
|
282
332
|
|
|
283
333
|
**Response extras:**
|
|
334
|
+
|
|
284
335
|
- `approval`: Approval decision record when `approvalStrategy="mcp_managed"`
|
|
285
336
|
- `mcpServers`: Requested/enabled/missing MCP servers for this call
|
|
286
337
|
|
|
287
338
|
**Example:**
|
|
339
|
+
|
|
288
340
|
```json
|
|
289
341
|
{
|
|
290
342
|
"prompt": "Write a Python function to calculate fibonacci numbers",
|
|
@@ -296,9 +348,11 @@ Execute a Claude Code request with optional session management.
|
|
|
296
348
|
```
|
|
297
349
|
|
|
298
350
|
##### `codex_request`
|
|
351
|
+
|
|
299
352
|
Execute a Codex request with optional session tracking.
|
|
300
353
|
|
|
301
354
|
**Parameters:**
|
|
355
|
+
|
|
302
356
|
- `prompt` (string, required): The prompt to send (1-100,000 chars)
|
|
303
357
|
- `model` (string, optional): Model name or alias (use `list_models` for available values; supports `latest`, recommended: `gpt-5.4`)
|
|
304
358
|
- `fullAuto` (boolean, optional): Enable full-auto mode, default: false
|
|
@@ -314,10 +368,12 @@ Execute a Codex request with optional session tracking.
|
|
|
314
368
|
- `idleTimeoutMs` (number, optional): Kill a stuck Codex process after output inactivity; 30,000 to 3,600,000 ms
|
|
315
369
|
|
|
316
370
|
**Response extras:**
|
|
371
|
+
|
|
317
372
|
- `approval`: Approval decision record when `approvalStrategy="mcp_managed"`
|
|
318
373
|
- `mcpServers`: Requested MCP servers for this call
|
|
319
374
|
|
|
320
375
|
**Example:**
|
|
376
|
+
|
|
321
377
|
```json
|
|
322
378
|
{
|
|
323
379
|
"prompt": "Create a REST API endpoint",
|
|
@@ -328,9 +384,11 @@ Execute a Codex request with optional session tracking.
|
|
|
328
384
|
```
|
|
329
385
|
|
|
330
386
|
##### `gemini_request`
|
|
387
|
+
|
|
331
388
|
Execute a Gemini CLI request with session support.
|
|
332
389
|
|
|
333
390
|
**Parameters:**
|
|
391
|
+
|
|
334
392
|
- `prompt` (string, required): The prompt to send (1-100,000 chars)
|
|
335
393
|
- `model` (string, optional): Model name or alias (use `list_models` for available values; supports `latest`, `pro`, `flash`)
|
|
336
394
|
- `sessionId` (string, optional): Session ID to resume
|
|
@@ -347,10 +405,12 @@ Execute a Gemini CLI request with session support.
|
|
|
347
405
|
- `correlationId` (string, optional): Request trace ID (auto-generated if omitted)
|
|
348
406
|
|
|
349
407
|
**Response extras:**
|
|
408
|
+
|
|
350
409
|
- `approval`: Approval decision record when `approvalStrategy="mcp_managed"`
|
|
351
410
|
- `mcpServers`: Requested MCP servers for this call
|
|
352
411
|
|
|
353
412
|
**Example:**
|
|
413
|
+
|
|
354
414
|
```json
|
|
355
415
|
{
|
|
356
416
|
"prompt": "Explain quantum computing",
|
|
@@ -361,9 +421,11 @@ Execute a Gemini CLI request with session support.
|
|
|
361
421
|
```
|
|
362
422
|
|
|
363
423
|
##### `grok_request`
|
|
424
|
+
|
|
364
425
|
Execute a Grok CLI (xAI) request with session support.
|
|
365
426
|
|
|
366
427
|
**Parameters:**
|
|
428
|
+
|
|
367
429
|
- `prompt` (string, required): The prompt to send (1-100,000 chars)
|
|
368
430
|
- `model` (string, optional): Model name or alias (e.g. `grok-build`, `latest`)
|
|
369
431
|
- `outputFormat` (string, optional): `"plain"` (default), `"json"`, or `"streaming-json"`
|
|
@@ -384,6 +446,7 @@ Execute a Grok CLI (xAI) request with session support.
|
|
|
384
446
|
- `correlationId` (string, optional): Request trace ID (auto-generated if omitted)
|
|
385
447
|
|
|
386
448
|
**Example:**
|
|
449
|
+
|
|
387
450
|
```json
|
|
388
451
|
{
|
|
389
452
|
"prompt": "Summarize the latest commit message in 1 sentence",
|
|
@@ -416,12 +479,14 @@ acknowledgeEphemeral = false # required to enable async tools wit
|
|
|
416
479
|
```
|
|
417
480
|
|
|
418
481
|
Backends:
|
|
482
|
+
|
|
419
483
|
- **`sqlite`** (default) — durable, file-backed. Safe for single-instance deployments.
|
|
420
484
|
- **`memory`** — in-process Map. Lost on gateway exit. Requires `acknowledgeEphemeral = true` to be loaded. Suitable for tests and ephemeral CI gateways.
|
|
421
485
|
- **`postgres`** — interface only, implementation not yet shipped. Selecting this backend throws at startup.
|
|
422
486
|
- **`none`** — no store. **`*_request_async`, `llm_job_status`, `llm_job_result`, and `llm_job_cancel` are NOT registered on the gateway.** This is a structural invariant: agents that try to call async tools against a gateway with `backend = "none"` get a clean "tool not found" at connect time instead of silent in-memory loss after the 1-hour TTL. Use `llm_process_health` to inspect the resolved persistence state programmatically.
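The backend rules above can be summarised as a small decision function. This is a hypothetical model rather than the gateway's implementation: the function name is invented, and treating a missing `acknowledgeEphemeral` as a thrown error is an assumption (the README only says the flag is required for the backend to load).

```typescript
type Backend = "sqlite" | "memory" | "postgres" | "none";

const PROVIDERS = ["claude", "codex", "gemini", "grok", "mistral"];
const JOB_TOOLS = ["llm_job_status", "llm_job_result", "llm_job_cancel"];

// Returns the async tool names a gateway would register for a backend,
// following the list above. Throws where the README says startup fails.
function asyncToolsFor(backend: Backend, acknowledgeEphemeral: boolean): string[] {
  if (backend === "postgres") {
    throw new Error("postgres backend is interface-only; not yet shipped");
  }
  if (backend === "memory" && !acknowledgeEphemeral) {
    // Assumption: modelled as an error; the README only states the requirement.
    throw new Error("memory backend requires acknowledgeEphemeral = true");
  }
  if (backend === "none") {
    // No store, so no async tools are registered at all.
    return [];
  }
  return [...PROVIDERS.map((p) => `${p}_request_async`), ...JOB_TOOLS];
}
```

With `backend = "none"` the function returns an empty list, which matches the "tool not found at connect time" behaviour described above.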
|
|
423
487
|
|
|
424
488
|
Legacy environment variables (deprecated; emit a warning at startup):
|
|
489
|
+
|
|
425
490
|
- `LLM_GATEWAY_LOGS_DB` / `LLM_GATEWAY_JOBS_DB` — `none` selects `backend = "none"`; any other value selects `backend = "sqlite"` with that path.
|
|
426
491
|
- `LLM_GATEWAY_JOB_RETENTION_DAYS` — overrides `retentionDays`.
|
|
427
492
|
- `LLM_GATEWAY_DEDUP_WINDOW_MS` — overrides `dedupWindowMs`.
|
|
@@ -459,7 +524,7 @@ backend = "sqlite"
|
|
|
459
524
|
path = "/srv/repos/.../my-repo/.gateway/logs.db"
|
|
460
525
|
```
|
|
461
526
|
|
|
462
|
-
Now every gateway subprocess spawned for
|
|
527
|
+
Now every gateway subprocess spawned for _this_ repo's Claude Code window reads its own config and writes to its own SQLite file; sessions, jobs, and dedup state are scoped to the repo. Other repos keep using the global default. `llm_process_health.persistence.sources.configFile` lets an agent confirm which config it's actually running under.
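An agent could use that field to check that it is running under the repo-local config. The sketch below is a guess at how such a check might look: only the `persistence.sources.configFile` path comes from the README, while the interface, the function name, and the POSIX-style path comparison are assumptions.

```typescript
// Assumed response shape: only persistence.sources.configFile is named in
// the README; the rest of the payload is not modelled here.
interface ProcessHealth {
  persistence?: { sources?: { configFile?: string | null } };
}

// Hypothetical check: true when the resolved config file lives under repoRoot.
// Uses a simple POSIX-style prefix comparison with a trailing slash so that
// "/srv/my-repo-2" is not mistaken for a child of "/srv/my-repo".
function usesRepoScopedConfig(health: ProcessHealth, repoRoot: string): boolean {
  const configFile = health.persistence?.sources?.configFile;
  if (typeof configFile !== "string" || configFile.length === 0) {
    return false;
  }
  const root = repoRoot.endsWith("/") ? repoRoot : repoRoot + "/";
  return configFile.startsWith(root);
}
```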
|
|
463
528
|
|
|
464
529
|
###### Agent-executable spec (DAG-TOML)
|
|
465
530
|
|
|
@@ -623,6 +688,7 @@ consumes = ["OUT:mcp-reconnected"]
|
|
|
623
688
|
**Why this matters for agents:** the gateway has multiple configuration surfaces (TOML file, env-var overrides, two different MCP settings files) and one easy mistake — editing the committed `.mcp.json` instead of the local-only `.claude/settings.local.json` — will silently break the per-project scope for every other developer on the repo. The DAG above encodes the correct sequence, the verification gate, and the failure modes explicitly so an agent can execute it without inference.
|
|
624
689
|
|
|
625
690
|
##### `mistral_request`
|
|
691
|
+
|
|
626
692
|
Run a Mistral Vibe agentic coding request. Like `grok_request` in shape, but with Vibe's specific surface:
|
|
627
693
|
|
|
628
694
|
- `model` (string, optional): Vibe model alias (for example `mistral-medium-3.5` or `latest`). The resolved value is injected via the `VIBE_ACTIVE_MODEL` environment variable; omit it to let the gateway discover Vibe config and avoid stale hardcoded defaults.
|
|
@@ -632,33 +698,41 @@ Run a Mistral Vibe agentic coding request. Like `grok_request` in shape, but wit
|
|
|
632
698
|
- `sessionId` / `resumeLatest` / `createNewSession`: standard session controls. Continuity requires `[session_logging] enabled = true` in `~/.vibe/config.toml` — `doctor --json` surfaces an actionable next-action when the toggle is missing.
|
|
633
699
|
|
|
634
700
|
##### `claude_request_async` / `codex_request_async` / `gemini_request_async` / `grok_request_async` / `mistral_request_async`
|
|
701
|
+
|
|
635
702
|
Start a long-running Claude, Codex, Gemini, Grok, or Mistral request without waiting for completion in the same MCP call.
|
|
636
703
|
|
|
637
704
|
Use this flow when analysis/runtime can exceed client tool-call limits:
|
|
705
|
+
|
|
638
706
|
1. Start job with `*_request_async`
|
|
639
707
|
2. Poll with `llm_job_status`
|
|
640
708
|
3. Fetch output with `llm_job_result`
|
|
641
709
|
4. Optionally stop with `llm_job_cancel`
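The four steps above can be sketched as a polling loop. This is a minimal sketch, not the gateway's client code: `callTool` is the same placeholder used in the README's session examples, and the `jobId` argument name, the `jobId` and `status` response fields, and the five-second interval are all assumptions.

```typescript
type JobStatus = "running" | "completed" | "failed" | "canceled";
type CallTool = (tool: string, args: Record<string, unknown>) => Promise<any>;

// Pure decision step: keep polling while running, fetch output once
// completed, and stop on any other terminal status.
function nextStep(status: JobStatus): "poll" | "fetch_result" | "stop" {
  if (status === "running") return "poll";
  if (status === "completed") return "fetch_result";
  return "stop";
}

// Hypothetical driver for the start / poll / fetch flow.
async function runAsyncJob(
  callTool: CallTool,
  startTool: string,
  prompt: string,
  pollMs: number = 5000
): Promise<unknown> {
  const started = await callTool(startTool, { prompt });
  const jobId = started.jobId; // assumed response field
  for (;;) {
    const { status } = await callTool("llm_job_status", { jobId });
    const step = nextStep(status);
    if (step === "fetch_result") {
      return callTool("llm_job_result", { jobId });
    }
    if (step === "stop") {
      throw new Error(`Job ${jobId} ended with status ${status}`);
    }
    // Still running: wait before the next status check.
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
}
```

A call might look like `await runAsyncJob(callTool, "codex_request_async", "Review this change")`. The sketch throws on `failed` and `canceled`; a caller that wants the captured stderr of a failed job could call `llm_job_result` in that branch instead.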
|
|
642
710
|
|
|
643
711
|
Async request tools accept the same approval strategy fields as their sync variants:
|
|
712
|
+
|
|
644
713
|
- `approvalStrategy`: `"legacy"` (default) or `"mcp_managed"`
|
|
645
714
|
- `approvalPolicy`: `"strict"|"balanced"|"permissive"` override
|
|
646
715
|
- `mcpServers`: Requested MCP servers (`sqry`, `exa`, `ref_tools`, `trstr`)
|
|
647
716
|
- `claude_request_async` also supports `strictMcpConfig` and fails fast when requested servers are unavailable
|
|
648
717
|
|
|
649
718
|
##### `llm_job_status`
|
|
719
|
+
|
|
650
720
|
Return lifecycle status (`running`, `completed`, `failed`, `canceled`) and metadata for an async job.
|
|
651
721
|
|
|
652
722
|
##### `llm_job_result`
|
|
723
|
+
|
|
653
724
|
Return captured stdout/stderr for an async job (with configurable max chars per stream).
|
|
654
725
|
|
|
655
726
|
##### `llm_job_cancel`
|
|
727
|
+
|
|
656
728
|
Cancel a running async job.
|
|
657
729
|
|
|
658
730
|
##### `approval_list`
|
|
731
|
+
|
|
659
732
|
List recent MCP-managed approval decisions recorded by the gateway.
|
|
660
733
|
|
|
661
734
|
**Parameters:**
|
|
735
|
+
|
|
662
736
|
- `limit` (number, optional): Max records (1-500), default: 50
|
|
663
737
|
- `cli` (string, optional): Filter by `"claude"`, `"codex"`, or `"gemini"`
|
|
664
738
|
|
|
@@ -667,14 +741,17 @@ Approval records are persisted to `~/.llm-cli-gateway/approvals.jsonl`.
|
|
|
667
741
|
#### Session Management Tools
|
|
668
742
|
|
|
669
743
|
##### `session_create`
|
|
744
|
+
|
|
670
745
|
Create a new session for a specific CLI.
|
|
671
746
|
|
|
672
747
|
**Parameters:**
|
|
748
|
+
|
|
673
749
|
- `cli` (string, required): CLI to create session for ("claude", "codex", "gemini", "grok", "mistral")
|
|
674
750
|
- `description` (string, optional): Description for the session
|
|
675
751
|
- `setAsActive` (boolean, optional): Set as active session, default: true
|
|
676
752
|
|
|
677
753
|
**Example:**
|
|
754
|
+
|
|
678
755
|
```json
|
|
679
756
|
{
|
|
680
757
|
"cli": "claude",
|
|
@@ -684,50 +761,64 @@ Create a new session for a specific CLI.
|
|
|
684
761
|
```
|
|
685
762
|
|
|
686
763
|
##### `session_list`
|
|
764
|
+
|
|
687
765
|
List all sessions, optionally filtered by CLI.
|
|
688
766
|
|
|
689
767
|
**Parameters:**
|
|
768
|
+
|
|
690
769
|
- `cli` (string, optional): Filter by CLI ("claude", "codex", "gemini", "grok", "mistral")
|
|
691
770
|
|
|
692
771
|
**Response includes:**
|
|
772
|
+
|
|
693
773
|
- Total session count
|
|
694
774
|
- Session details (ID, CLI, description, timestamps, active status)
|
|
695
775
|
- Active session IDs for each CLI
|
|
696
776
|
|
|
697
777
|
##### `session_set_active`
|
|
778
|
+
|
|
698
779
|
Set the active session for a specific CLI.
|
|
699
780
|
|
|
700
781
|
**Parameters:**
|
|
782
|
+
|
|
701
783
|
- `cli` (string, required): CLI to set active session for
|
|
702
784
|
- `sessionId` (string or null, required): Session ID to activate, or `null` to clear the active session
|
|
703
785
|
|
|
704
786
|
##### `session_get`
|
|
787
|
+
|
|
705
788
|
Retrieve details for a specific session.
|
|
706
789
|
|
|
707
790
|
**Parameters:**
|
|
791
|
+
|
|
708
792
|
- `sessionId` (string, required): Session ID to retrieve
|
|
709
793
|
|
|
710
794
|
##### `session_delete`
|
|
795
|
+
|
|
711
796
|
Delete a specific session.
|
|
712
797
|
|
|
713
798
|
**Parameters:**
|
|
799
|
+
|
|
714
800
|
- `sessionId` (string, required): Session ID to delete
|
|
715
801
|
|
|
716
802
|
##### `session_clear_all`
|
|
803
|
+
|
|
717
804
|
Clear all sessions, optionally for a specific CLI.
|
|
718
805
|
|
|
719
806
|
**Parameters:**
|
|
807
|
+
|
|
720
808
|
- `cli` (string, optional): Clear sessions for specific CLI only
|
|
721
809
|
|
|
722
810
|
#### Utility Tools
|
|
723
811
|
|
|
724
812
|
##### `list_models`
|
|
813
|
+
|
|
725
814
|
List available models for each CLI.
|
|
726
815
|
|
|
727
816
|
**Parameters:**
|
|
817
|
+
|
|
728
818
|
- `cli` (string, optional): Specific CLI to list models for ("claude", "codex", "gemini", "grok", "mistral")
|
|
729
819
|
|
|
730
820
|
**Response includes:**
|
|
821
|
+
|
|
731
822
|
- Model names and descriptions
|
|
732
823
|
- Best use cases for each model
|
|
733
824
|
- CLI-specific information
|
|
@@ -764,21 +855,26 @@ LLM_GATEWAY_DISABLE_MODEL_DISCOVERY=1
|
|
|
764
855
|
```
|
|
765
856
|
|
|
766
857
|
##### `cli_versions`
|
|
858
|
+
|
|
767
859
|
Report installed CLI versions.
|
|
768
860
|
|
|
769
861
|
**Parameters:**
|
|
862
|
+
|
|
770
863
|
- `cli` (string, optional): Specific CLI to inspect ("claude", "codex", "gemini", "grok", "mistral")
|
|
771
864
|
|
|
772
865
|
##### `cli_upgrade`
|
|
866
|
+
|
|
773
867
|
Plan or run an upgrade for one CLI.
|
|
774
868
|
|
|
775
869
|
**Parameters:**
|
|
870
|
+
|
|
776
871
|
- `cli` (string, required): CLI to upgrade ("claude", "codex", "gemini", "grok", "mistral")
|
|
777
872
|
- `target` (string, optional): Package tag/version/target, default: `latest`
|
|
778
873
|
- `dryRun` (boolean, optional): Return the upgrade plan without running it, default: `true`
|
|
779
874
|
- `timeoutMs` (number, optional): Upgrade timeout when `dryRun=false`
|
|
780
875
|
|
|
781
876
|
**Upgrade strategies:**
|
|
877
|
+
|
|
782
878
|
- Claude latest: `claude update`
|
|
783
879
|
- Claude explicit target: `claude install <target>`
|
|
784
880
|
- Codex latest: `codex update`
|
|
@@ -786,6 +882,7 @@ Plan or run an upgrade for one CLI.
|
|
|
786
882
|
- Gemini: `npm install -g @google/gemini-cli@<target>`
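The strategies above amount to a mapping from CLI and target to a command. The sketch below is hypothetical and covers only the strategies visible in this list; the function name is invented, and anything not shown here returns `null` rather than a guessed command.

```typescript
// Hypothetical planner: returns the command as an argument vector, or null
// when this excerpt does not show a strategy for the given CLI and target.
function planUpgradeCommand(cli: string, target: string = "latest"): string[] | null {
  if (cli === "claude") {
    return target === "latest"
      ? ["claude", "update"]
      : ["claude", "install", target];
  }
  if (cli === "codex" && target === "latest") {
    return ["codex", "update"];
  }
  if (cli === "gemini") {
    return ["npm", "install", "-g", `@google/gemini-cli@${target}`];
  }
  return null; // strategy not shown in this excerpt
}
```

Returning an argument vector instead of a single shell string is a design choice in this sketch: a caller-supplied `target` is then passed as one argument and is never interpreted by a shell.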
|
|
787
883
|
|
|
788
884
|
**Example dry run:**
|
|
885
|
+
|
|
789
886
|
```json
|
|
790
887
|
{
|
|
791
888
|
"cli": "gemini",
|
|
@@ -810,7 +907,7 @@ Plan or run an upgrade for one CLI.
|
|
|
810
907
|
await callTool("session_create", {
|
|
811
908
|
cli: "claude",
|
|
812
909
|
description: "Debugging session",
|
|
813
|
-
setAsActive: true
|
|
910
|
+
setAsActive: true,
|
|
814
911
|
});
|
|
815
912
|
|
|
816
913
|
// 2. Make requests (automatically uses active session)
|
|
@@ -822,7 +919,7 @@ await callTool("claude_request", {
|
|
|
822
919
|
// 3. Continue the conversation
|
|
823
920
|
await callTool("claude_request", {
|
|
824
921
|
prompt: "Can you explain that fix in more detail?",
|
|
825
|
-
continueSession: true
|
|
922
|
+
continueSession: true,
|
|
826
923
|
});
|
|
827
924
|
|
|
828
925
|
// 4. List all sessions
|
|
@@ -831,12 +928,12 @@ await callTool("session_list", { cli: "claude" });
|
|
|
831
928
|
// 5. Switch to a different session
|
|
832
929
|
await callTool("session_set_active", {
|
|
833
930
|
cli: "claude",
|
|
834
|
-
sessionId: "some-other-session-id"
|
|
931
|
+
sessionId: "some-other-session-id",
|
|
835
932
|
});
|
|
836
933
|
|
|
837
934
|
// 6. Delete when done
|
|
838
935
|
await callTool("session_delete", {
|
|
839
|
-
sessionId: "session-id-to-delete"
|
|
936
|
+
sessionId: "session-id-to-delete",
|
|
840
937
|
});
|
|
841
938
|
```
|
|
842
939
|
|
|
@@ -864,6 +961,7 @@ await callTool("session_delete", {
|
|
|
864
961
|
### CLI-Specific Settings
|
|
865
962
|
|
|
866
963
|
Each CLI can be configured through its own configuration files:
|
|
964
|
+
|
|
867
965
|
- Claude Code: `~/.claude/config.json`
|
|
868
966
|
- Codex: `~/.codex/config.toml`
|
|
869
967
|
- Gemini: `~/.gemini/config.json`
|
|
@@ -939,6 +1037,7 @@ npm start
|
|
|
939
1037
|
The gateway provides detailed error messages for common issues:
|
|
940
1038
|
|
|
941
1039
|
### CLI Not Found
|
|
1040
|
+
|
|
942
1041
|
```
|
|
943
1042
|
Error executing claude CLI:
|
|
944
1043
|
spawn claude ENOENT
|
|
@@ -947,12 +1046,14 @@ The 'claude' command was not found. Please ensure claude CLI is installed and in
|
|
|
947
1046
|
```
|
|
948
1047
|
|
|
949
1048
|
### External Timeout / Legacy Timeout Option
|
|
1049
|
+
|
|
950
1050
|
```
|
|
951
1051
|
Error executing codex CLI: Command timed out
|
|
952
1052
|
Process timed out after 120000ms
|
|
953
1053
|
```
|
|
954
1054
|
|
|
955
1055
|
### Invalid Parameters
|
|
1056
|
+
|
|
956
1057
|
```
|
|
957
1058
|
Prompt cannot be empty
|
|
958
1059
|
Prompt too long (max 100k chars)
|
|
@@ -970,6 +1071,7 @@ Logs are written to stderr (stdout is reserved for MCP protocol):
|
|
|
970
1071
|
```
|
|
971
1072
|
|
|
972
1073
|
Enable debug logging:
|
|
1074
|
+
|
|
973
1075
|
```bash
|
|
974
1076
|
DEBUG=1 node dist/index.js
|
|
975
1077
|
```
|
|
@@ -979,6 +1081,7 @@ DEBUG=1 node dist/index.js
|
|
|
979
1081
|
### CLIs Not Found
|
|
980
1082
|
|
|
981
1083
|
Make sure the CLIs are installed and in your PATH:
|
|
1084
|
+
|
|
982
1085
|
```bash
|
|
983
1086
|
which claude
|
|
984
1087
|
which codex
|
|
@@ -986,6 +1089,7 @@ which gemini
|
|
|
986
1089
|
```
|
|
987
1090
|
|
|
988
1091
|
The gateway extends PATH to include common locations:
|
|
1092
|
+
|
|
989
1093
|
- `~/.local/bin`
|
|
990
1094
|
- `/usr/local/bin`
|
|
991
1095
|
- `/usr/bin`
|
|
@@ -994,6 +1098,7 @@ The gateway extends PATH to include common locations:
|
|
|
994
1098
|
### Permission Errors
|
|
995
1099
|
|
|
996
1100
|
If you encounter permission errors, ensure the CLI tools have proper permissions:
|
|
1101
|
+
|
|
997
1102
|
```bash
|
|
998
1103
|
chmod +x $(which claude)
|
|
999
1104
|
chmod +x $(which codex)
|
|
@@ -1005,16 +1110,19 @@ chmod +x $(which gemini)
 Sessions are stored in `~/.llm-cli-gateway/sessions.json`. If you encounter issues:
 
 1. Check file permissions:
+
 ```bash
 ls -la ~/.llm-cli-gateway/
 ```
 
 2. Reset sessions:
+
 ```bash
 rm ~/.llm-cli-gateway/sessions.json
 ```
 
 3. Or manually edit the session file:
+
 ```bash
 cat ~/.llm-cli-gateway/sessions.json
 ```
@@ -1038,19 +1146,20 @@ The gateway supports concurrent requests across different CLIs. Each request spa
|
|
|
1038
1146
|
- **No Eval**: No dynamic code evaluation in our source (see "Socket alerts" below for the transitive `ajv` codegen case)
|
|
1039
1147
|
- **Sandboxing**: Consider running in containers for production use
|
|
1040
1148
|
- **Provenance**: Releases are published with [npm provenance](https://docs.npmjs.com/generating-provenance-statements) via OIDC trusted publishing from GitHub Actions
|
|
1149
|
+
- **Release signing**: GitHub release installer artifacts are signed with Sigstore keyless signing; verify `SHA256SUMS.sigstore.json` before trusting the checksum file
|
|
1041
1150
|
|
|
1042
1151
|
### Socket alerts — context for reviewers
|
|
1043
1152
|
|
|
1044
1153
|
If you're vetting `llm-cli-gateway` through [Socket](https://socket.dev/npm/package/llm-cli-gateway) or a similar supply-chain scanner, you'll see three behavioural alerts and some dependency-ownership alerts. They are accurate descriptions of what the package does and what it depends on; we've left them visible (not silenced in `socket.yml`) so you don't have to take our word for it. Here's the context for each:
|
|
1045
1154
|
|
|
1046
|
-
| Alert
|
|
1047
|
-
|
|
1048
|
-
| **Network access**
|
|
1049
|
-
| **Shell access**
|
|
1050
|
-
| **Uses eval**
|
|
1051
|
-
| **better-sqlite3 PRAGMA helper** | Transitive: `better-sqlite3/lib/methods/pragma.js` interpolates its caller-provided `source` into a `PRAGMA ${source}` statement.
|
|
1052
|
-
| **ioredis obfuscated code**
|
|
1053
|
-
| **Dependency ownership**
|
|
1155
|
+
| Alert | Where | Why it's bounded |
|
|
1156
|
+
| -------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
1157
|
+
| **Network access** | `src/http-transport.ts` opens an HTTP MCP transport when started via `npm run start:http`. `src/endpoint-exposure.ts` issues a HEAD probe to verify configured public/tunnel URLs. | The transport binds to `127.0.0.1` by default and requires `LLM_GATEWAY_AUTH_TOKEN` to be set. The default stdio MCP entry point (`npm start`) opens no sockets. |
|
|
1158
|
+
| **Shell access** | `src/executor.ts` uses `child_process.spawn(cmd, args, …)` to invoke the underlying LLM CLIs. | `spawn` is called with an argument array and **never** `shell: true`, so there is no shell interpolation path for caller input. The command name is restricted to an allow-list of known CLI binaries (`claude`, `codex`, `gemini`, `grok`, `vibe`). |
|
|
1159
|
+
| **Uses eval** | None in our source. Transitive: `@modelcontextprotocol/sdk` → `ajv@8` uses `new Function(...)` in `ajv/dist/compile/index.js` to compile JSON Schema validators. | This is ajv's standard codegen path. Only known schemas (defined in our source and the MCP SDK) flow into it; no caller-supplied data ever reaches the compiled function body. |
|
|
1160
|
+
| **better-sqlite3 PRAGMA helper** | Transitive: `better-sqlite3/lib/methods/pragma.js` interpolates its caller-provided `source` into a `PRAGMA ${source}` statement. | We do not call `db.pragma()` from production source. Internal SQLite setup uses fixed literal `db.exec("PRAGMA ...")` statements, and `npm run security:audit` fails the release if production code reintroduces `.pragma()` calls. |
|
|
1161
|
+
| **ioredis obfuscated code** | Optional peer/dev dependency: `ioredis@5.10.1` may be flagged at `built/constants/TLSProfiles.js` for base64-looking strings. | Reviewed as a false positive. The file is a Redis Cloud TLS CA certificate bundle in PEM format, which is base64 by design. It contains no decoder loop, dynamic evaluation, network call, or hidden execution path. The same file is byte-for-byte identical in `ioredis@5.9.2`; our default production install does not install `ioredis`, and our code does not pass ioredis TLS profile options. |
|
|
1162
|
+
| **Dependency ownership** | A handful of small transitive packages (e.g. `bindings` via `better-sqlite3`, `media-typer` via `@modelcontextprotocol/sdk`) trip Socket's "unstable ownership" or "obfuscated code" heuristics. | These are pinned, well-known micro-deps in the Node ecosystem with no known issues. We pin direct override versions of `content-type` and `type-is` in `package.json#overrides`. Our previous direct dependency on `toml@3.0.0` (also single-maintainer, last released 2020) was replaced with the actively-maintained `smol-toml` to reduce inherited risk. |
|
|
1054
1163
|
|
|
1055
1164
|
See [`socket.yml`](./socket.yml) for the same context in machine-readable form.
|
|
1056
1165
|
|
|
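
The **Shell access** row in the hunk above describes two safeguards: `spawn` receives an argument array without `shell: true`, and the command name is checked against an allow-list. As a rough illustration of that pattern, here is a minimal TypeScript sketch. The names `ALLOWED_CLIS`, `isAllowedCli`, and `spawnCli` are hypothetical and chosen for this example; the package's actual `src/executor.ts` is not shown in this diff and may be structured differently.

```typescript
import { spawn, ChildProcess } from "node:child_process";

// Hypothetical allow-list, mirroring the binaries named in the README table.
const ALLOWED_CLIS = ["claude", "codex", "gemini", "grok", "vibe"] as const;
type AllowedCli = (typeof ALLOWED_CLIS)[number];

// Type guard: true only for an exact match against the allow-list.
function isAllowedCli(name: string): name is AllowedCli {
  return (ALLOWED_CLIS as readonly string[]).includes(name);
}

// Hypothetical helper (an assumption, not the gateway's real executor).
function spawnCli(name: string, args: string[]): ChildProcess {
  if (!isAllowedCli(name)) {
    throw new Error(`Refusing to spawn unknown CLI: ${name}`);
  }
  // `args` is passed as an array and `shell` is left at its default (false),
  // so shell metacharacters inside an argument are never interpreted.
  return spawn(name, args, { stdio: ["ignore", "pipe", "pipe"] });
}
```

With this shape, a prompt such as `"hello; rm -rf /"` reaches the child process as one literal argument, and a command name outside the list is rejected before any process is created.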
@@ -1070,6 +1179,7 @@ MIT. See [LICENSE](LICENSE) for details.
 ## Support
 
 For issues and questions:
+
 - Open an issue on GitHub
 - Check existing issues and documentation
 - Review CLI-specific documentation for CLI-related problems
|